Following on from our API 101 webinar in January, we’ve produced a roundup of some of the key pointers and questions that came out of the webinar. We’ve tried to keep this as concise as possible, but be warned: APIs can get quite technical quite quickly, so Google (or ChatGPT!) may well be your friend if you want to find out more detail.
What are the different API protocols and how should we choose between them?
In this article, we won’t be covering the basics of what Application Programming Interfaces (APIs) are – that’s what the webinar was for! So the question here is about the most common protocols used for APIs, and how they differ.
In most cases, as an end user you won’t have any choice over the type of API you use; the software vendor builds the API, and you simply connect to it. However, in some cases there are options available, and if you do find yourself building your own API, it’s useful to know a bit about the protocols available.
On the webinar we focused mostly on REST (Representational State Transfer) APIs, which are probably the most common and simplest protocol. They take a straightforward client/server, “call and response” approach. Structurally, REST APIs are quite flexible, which makes them easy to build and use. They are often used in public APIs, but they may be less secure than other protocols and may not support all the functionality needed for some applications.
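To make the “call and response” idea concrete, here is a minimal sketch of a REST-style GET request in Python. The endpoint, parameters and response fields are invented for illustration, and the response is a canned example rather than a live network call:

```python
import json
from urllib.parse import urlencode

# Hypothetical REST endpoint; the base URL and parameter names are
# illustrative, not a real service.
BASE_URL = "https://api.example.com/v1/invoices"

def build_get_url(base_url, params):
    """Build the URL for a REST GET request from a dict of query parameters."""
    return f"{base_url}?{urlencode(params)}"

url = build_get_url(BASE_URL, {"status": "paid", "page": 1})
print(url)  # https://api.example.com/v1/invoices?status=paid&page=1

# A REST response is typically JSON; here we parse a canned example
# rather than calling a live server.
sample_response = '{"data": [{"id": 101, "amount": 250.0}], "page": 1}'
payload = json.loads(sample_response)
print(payload["data"][0]["id"])  # 101
```

In practice the request would be sent with a library such as `requests` (or Excel, Alteryx, etc. would do this step for you), but the shape of the exchange is exactly this: a URL with parameters goes out, structured JSON comes back.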
SOAP (Simple Object Access Protocol) APIs are similar, but much more tightly structured. This can make them more difficult to work with and they can be a little slower than REST APIs. However, the rules that govern SOAP APIs improve interoperability and support compliance standards. As a result, they are often used in scenarios where data security is important.
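To show what “tightly structured” means in practice, here is a sketch of a SOAP request envelope. The GetBalance operation and its namespaces are invented for illustration; a real SOAP service defines these in its WSDL contract:

```python
import xml.etree.ElementTree as ET

# A minimal SOAP 1.1 envelope. Unlike a free-form REST payload, the
# Envelope/Body structure and namespaces are mandatory, which is what
# makes SOAP both stricter and more verbose.
soap_request = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetBalance xmlns="http://example.com/accounts">
      <AccountId>12345</AccountId>
    </GetBalance>
  </soap:Body>
</soap:Envelope>"""

# The rigid structure means a client can locate elements reliably.
root = ET.fromstring(soap_request)
body = root.find("{http://schemas.xmlsoap.org/soap/envelope/}Body")
account = body.find(
    "{http://example.com/accounts}GetBalance/"
    "{http://example.com/accounts}AccountId"
)
print(account.text)  # 12345
```

The extra ceremony is the trade-off the paragraph above describes: more effort to construct, but every message is validated against an agreed contract.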
There is some ambiguity as to whether RPC (Remote Procedure Call) is a type of API, which comes down to complexities of implementation. Put simply, it’s a means for computers to send messages to each other to request that functions be carried out. The protocols for RPC have been around for a long time and aren’t designed with data exchange in mind, but RPC can be used in this way if needed.
It is worth briefly touching on GraphQL, as a relatively new entrant to the world of API protocols. It can support multiple requests in a single API call, which reduces both the number of calls, and the tendency for REST or SOAP to return more information than is actually needed. It is, however, more complex and because of this can place additional load on the servers responding to requests.
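As an illustration of combining requests, here is a sketch of a GraphQL query that asks for a customer and their recent invoices in one request; with REST this would typically take two calls and return more fields than needed. The field and type names are hypothetical:

```python
import json

# A single GraphQL query fetching two related resources at once,
# and only the fields we actually want. The schema is invented.
query = """
{
  customer(id: "C-42") {
    name
    invoices(last: 3) {
      id
      amount
    }
  }
}
"""

# GraphQL requests are usually sent as a JSON POST body to a single
# endpoint, rather than to resource-specific URLs as in REST.
post_body = json.dumps({"query": query})
parsed = json.loads(post_body)
print("customer" in parsed["query"])  # True
print("invoices" in parsed["query"])  # True
```

The flexibility comes at a cost: the server has to interpret and resolve each bespoke query, which is the extra load mentioned above.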
Streaming APIs were also mentioned on the webinar – these are a bit different as they allow servers to “push” data to clients, rather than requiring the client to make repeated “pull” API calls. The server knows which clients are subscribed to the API, what information they require and on what trigger (time or event-based), and delivers data to the client on this basis. Apps that use push notifications on your phone (e.g. WhatsApp) will use this type of API.
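One common streaming mechanism is Server-Sent Events, where the server keeps an HTTP connection open and pushes `data:` lines as events occur. A minimal sketch of the client-side parsing, fed from a simulated stream rather than a live connection:

```python
def parse_sse(lines):
    """Parse an iterable of Server-Sent Events lines into event strings.

    In SSE, each event is one or more 'data:' lines followed by a blank
    line; the server pushes these whenever its trigger fires.
    """
    buffer = []
    for line in lines:
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].strip())
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []

# Simulated stream; in practice these lines would arrive over a
# long-lived HTTP response as the server pushes updates.
stream = ['data: {"price": 101.5}', "", 'data: {"price": 101.7}', ""]
events = list(parse_sse(stream))
print(len(events))  # 2
```

The key difference from the pull model is visible here: the client just listens, and the server decides when data arrives.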
What are the security risks of pulling data through an API and how would you control them?
APIs can be very secure, but it’s important to design them appropriately. The main risks are pretty much what one might expect: there is a risk that the API gives a user access to data that they shouldn’t have, and there is a risk that data transmitted via API is intercepted. During the webinar we very briefly touched on tokens and keys, which are a common way of ensuring security of data, though access to those keys must be strictly controlled, as anyone with the key can effectively make their own API calls. To control these risks, best practice is to authenticate the user, authorise the API through a key, and validate both using a token. Tokens typically have a set duration or are tied to specific users or devices, ensuring that only authorised users can access data through the API.
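A hedged sketch of the key-and-token ideas above: attaching a bearer credential to a request, and checking a token against a lifetime. The key value is a placeholder; in practice it would come from a secrets store or environment variable, never hard-coded:

```python
import time

def auth_headers(api_key):
    """Attach a bearer credential to a request's headers.

    Anyone holding this key can make API calls, which is why access
    to keys must be strictly controlled.
    """
    return {"Authorization": f"Bearer {api_key}"}

def token_valid(issued_at, lifetime_seconds, now=None):
    """Check a token against its allowed lifetime; tokens typically expire."""
    now = time.time() if now is None else now
    return (now - issued_at) < lifetime_seconds

headers = auth_headers("example-key-123")  # placeholder, not a real key
print(headers["Authorization"])  # Bearer example-key-123
print(token_valid(issued_at=1000, lifetime_seconds=3600, now=2000))  # True
print(token_valid(issued_at=1000, lifetime_seconds=3600, now=5000))  # False
```

Real schemes (OAuth 2.0 and the like) add signing, scopes and refresh flows on top, but the core pattern of a short-lived, per-user credential is the same.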
Encryption is another crucial consideration for securing API data. Although encryption is generally easy to build into APIs, it is not always present by default. Encryption helps ensure that data transmitted via the API is secure and protected from interception. User permissions can also be employed to limit access to specific resources. Lastly, organisations that rely heavily on APIs should have a central repository to catalogue and manage them.
What is "DDOS"?
During the webinar, we briefly touched on Distributed Denial of Service (DDOS) attacks in the context of API security. DDOS attacks are a type of cyber-attack where a server is flooded with requests from a wide range of sources in an attempt to overwhelm it. They are often carried out using a network of compromised devices, controlled remotely. DDOS attacks on servers that manage API calls can impact the server’s ability to respond to legitimate requests.
API developers should be aware of the potential for such attacks and take measures to limit their impact. One common approach is to limit the number of requests that users can make within certain timeframes (per second, hour or day). Another approach is to throttle the speed of the API, limiting the rate at which records are returned. These measures can help prevent overload and ensure that the API remains available and responsive, but if implemented poorly can lead to a poor user experience.
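The per-timeframe request limit described above is often implemented as a sliding-window rate limiter. A minimal sketch (the limits chosen here are arbitrary examples):

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most max_calls within a sliding window of window_seconds.

    A server-side guard like this caps how hard any one client can hit
    an API, which helps blunt floods of requests.
    """
    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls = deque()  # timestamps of recent accepted calls

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

limiter = RateLimiter(max_calls=2, window_seconds=1.0)
print(limiter.allow(now=0.0))  # True
print(limiter.allow(now=0.1))  # True
print(limiter.allow(now=0.2))  # False, over the limit
print(limiter.allow(now=1.5))  # True, the window has moved on
```

A rejected call would normally get an HTTP 429 (“Too Many Requests”) response, ideally with a header telling the client when to retry, which is where the user-experience trade-off comes in.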
Can you restrict API use, so that people can only use GET (like a read only access version)?
Yes, it’s possible to restrict API use to read-only access. Many public APIs are designed only to permit users to read data and not write to the server. APIs between systems are usually defined very clearly as to whether the API is read-only or read-write. In some cases, the end user can choose the level of permission that the requesting system is given. For example, when you connect a third party app to your Facebook or Google account, or to Xero, the prompt that lets you customise/give permission to the app is essentially configuring the API rights for that app.
Even if an API is designed as a read-write API, it can often be restricted such that the user’s own credentials and permissions govern what they can and cannot do. For example, an API can be configured to only allow a user to modify their own data or to only allow certain users to access certain resources. However, it’s important to note that this requires careful design, as APIs can also be used as a means of bypassing user permissions directly within the system. Ultimately, APIs need to be managed so that users only have access to the data and functions they need to perform their tasks.
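As a sketch of that server-side check, here is a toy authorisation function: every call is tested against the caller’s own grants, so a read-write API can still be read-only for a given user. The permission model and user names are invented for illustration:

```python
# Hypothetical per-user grants: which actions each user may perform
# on which resources. Real systems would hold this in a directory or
# identity provider, not a dict.
PERMISSIONS = {
    "alice": {"read": {"invoices", "customers"}, "write": {"invoices"}},
    "bob":   {"read": {"invoices"}, "write": set()},
}

def authorise(user, action, resource):
    """Return True only if this user's grants allow the action on the resource."""
    grants = PERMISSIONS.get(user, {})
    return resource in grants.get(action, set())

print(authorise("alice", "write", "invoices"))  # True
print(authorise("bob", "write", "invoices"))    # False, bob is read-only
print(authorise("eve", "read", "invoices"))     # False, unknown user
```

The important design point is that this check sits on the server, inside the API handler; a check that only lives in the client can be bypassed.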
How do we ensure that the data transferred via an API is complete and accurate?
Providing the connection is working, an API call will always generate some sort of response from the server, even if the call is incorrect. This means it is very unusual to make a correct API call but receive an incomplete response, unless the connection to the server has failed during the data transmission.
To support the accuracy and completeness of API responses, it's common to break down the data into manageable chunks or “pages” of data – perhaps 1000 records at a time. We touched on this briefly when demoing APIs in the webinar. While it may seem counterintuitive to make many smaller API calls, it can be the most efficient and reliable approach when structured correctly.
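The paging pattern can be sketched as a simple loop that keeps requesting pages until the server returns a short (or empty) page. The page-fetching function is a stand-in for a real API call so the logic can be shown without a live server:

```python
def fetch_all(fetch_page, page_size=1000):
    """Pull a dataset in pages until the server returns a short or empty page.

    fetch_page(page, page_size) stands in for the real API call.
    """
    records, page = [], 1
    while True:
        batch = fetch_page(page, page_size)
        records.extend(batch)
        if len(batch) < page_size:  # last page reached
            break
        page += 1
    return records

# Stub "server" holding 2,500 records, returned 1,000 at a time.
DATA = list(range(2500))

def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return DATA[start:start + page_size]

print(len(fetch_all(fake_fetch)))  # 2500
```

If one page fails, only that page needs to be retried rather than the whole extraction, which is a big part of why the many-small-calls approach is more reliable.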
Even once a complete API response is received, the data should still be validated, just like any other method of obtaining information. This can be done by checking against other information sources, such as obtaining another comparable dataset (e.g. a trial balance to compare to the general ledger), or corroborating against third party sources, in true audit fashion. APIs can also be designed to include validation mechanisms, such as checksums or data validation rules.
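As a sketch of the checksum idea, here is a client verifying a response against a hash supplied alongside the data. The response shape (a `sha256` field next to the records) is hypothetical; a real API would document its own mechanism:

```python
import hashlib
import json

def checksum(records):
    """SHA-256 over a canonical JSON encoding of the records."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# Suppose the API response carries a checksum alongside the data, so
# the client can confirm nothing was lost or altered in transit.
records = [{"id": 1, "amount": 100.0}, {"id": 2, "amount": 250.0}]
response = {"data": records, "sha256": checksum(records)}

received = response["data"]
print(checksum(received) == response["sha256"])  # True

# Any truncation or tampering changes the hash.
print(checksum(received[:1]) == response["sha256"])  # False
```

Both sides must agree on the canonical encoding (here, JSON with sorted keys) or identical data will still produce mismatching hashes.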
Is it possible to use APIs to grab parameters from one source, then include those parameters in a GET request to a second source?
Absolutely! In fact this approach to APIs is extremely common. Making an initial “metadata” request can help determine the appropriate scope and criteria for the main API call. As most APIs are designed not to simply return all available records, it’s often necessary to make a call to determine the total number of records and then use this information to inform the main API call.
Another example of this approach is with data extraction, where an initial API request is used to determine the appropriate scope using criteria such as available companies, date ranges or accounts. Users can then select the ‘in scope’ criteria, prompting a second API call to perform the actual extraction based on this. This approach helps to reduce errors and streamlines the data extraction process.
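The two-step pattern can be sketched as a metadata call followed by a scoped extraction call. Both endpoints are stubbed here, and the company names and record counts are invented:

```python
def get_metadata():
    """First call: discover what is available (companies, record count)."""
    return {"companies": ["Alpha Ltd", "Beta plc"], "record_count": 2300}

def extract(company, page_size=1000):
    """Second call: the actual extraction, scoped by the user's selection."""
    return [{"company": company, "row": i} for i in range(5)]  # trimmed sample

meta = get_metadata()

# Use the metadata to plan the main call(s), e.g. how many pages of
# 1,000 records the full extraction will need.
pages_needed = -(-meta["record_count"] // 1000)  # ceiling division
print(pages_needed)  # 3

rows = extract(meta["companies"][0])
print(rows[0]["company"])  # Alpha Ltd
```

Nothing stops the two calls going to different systems either, which is exactly the scenario in the question: parameters from one source feeding a GET request to a second.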
What is a suitable tool to call/run an API transaction?
There are many tools available that can be used to call APIs, each with their own strengths and weaknesses. As a starting point, the ‘Get Data’ option in Excel is able to make API requests (as demonstrated on the webinar), which can be useful for simple data retrieval tasks. Other software such as Alteryx or KNIME can also make API calls, but none of these tools is designed for the secure management of access keys and tokens, which more complex API use cases are likely to require. In those cases, Python can be a good option as it offers greater flexibility and control over the API transaction process. Tools such as Databricks or Microsoft Visual Studio also offer built-in key and token management capabilities, making them more suitable for enterprise-level API transactions.
There are dedicated tools such as Postman for designing and building APIs, which can be useful for testing API calls (also briefly demonstrated). It offers a user-friendly interface for creating, managing, and testing API requests, making it a popular choice among developers.
When choosing a tool to call/run an API transaction, it's important to consider the specific needs and requirements of the project. Factors such as security, ease of use, and integration with other tools should all be taken into account to ensure the most suitable tool is selected.
Is it possible to make API calls in Google Sheets?
Yes. Google Sheets can make API calls through Google Apps Script, using its UrlFetchApp service to send HTTP requests. For simple cases, built-in functions such as IMPORTDATA can also pull data straight from a URL that returns CSV.
Can WinSCP be used to run an API?
Not really. WinSCP is a file transfer client (SFTP, FTP, Amazon S3 and similar protocols) rather than an API tool, so it isn’t designed for making web API calls. It does offer scripting and a .NET assembly for automating file transfers, which can sit alongside API-based processes, but for calling APIs themselves one of the tools mentioned above is a better fit.
Can you use APIs to automate data analysis for finance systems like SAP?
Whether or not you can use APIs to automate data analysis for finance systems like SAP depends on the hosting location of the system and whether it supports making calls through APIs.
APIs are typically web-based interfaces, so cloud-hosted finance systems are better suited to make and receive API calls. For instance, platforms like Xero, Sage and QuickBooks offer ‘marketplaces’ where third party applications can be integrated with their systems using APIs. These integrations are subject to tight control to reduce security and performance risks, and often only approved solutions are allowed to connect through APIs to their systems.
APIs can be used for server-to-server integrations within internal systems, but the protocols used are often system-specific, and not all systems, especially older ones, support APIs. SAP has its own programming language (ABAP) and a relatively opaque database structure. As a result, specialist knowledge and tools are typically required to interact with SAP systems, though APIs are one of a number of options available for external connections.
Can we get this excel file to refresh automatically, for example, daily?
During the webinar we demonstrated how to use Power Query in Excel to access a simple public API. It’s a common misconception that Power Query must be manually refreshed, as it’s actually possible to configure it to refresh automatically with a simple configuration change.
To do this, first set up the API call as demonstrated on the webinar. Then in Excel, go to Data > Queries & Connections. Right click on the query you want to refresh and select Properties to open the Query Properties window. Here, you’ll see options to “Refresh every” (a timed refresh) and “Refresh data when opening the file”. Note that these refreshes only run while the workbook is open in Excel; a fully unattended daily refresh needs something to open the file on a schedule.