Python - a programming language that is human readable, easy to learn and, most importantly, easy to use. It is no wonder that it is one of the most popular languages today. From building web applications and automation scripts to analyzing data and building machine learning models, Python is a jack of all trades. As with anything, it has its shortcomings, which include its speed and its lack of granular access to machine hardware. This is largely due to Python being an interpreter-based language rather than a compiler-based one.
Luckily for us, there is a way to solve that problem. Python has a native library that allows you to use functions built in compiled languages. This gives you the ability to leverage the strengths of both Python and compiler-based languages. Some of the most popular data science libraries, such as Pandas and XGBoost, have modules written in C/C++ (compiler-based languages) to overcome Python’s speed and performance issues while still offering a user-friendly Python API.
In this post, we are going to walk through interpreter-based languages vs. compiler-based languages, Python’s built-in ctypes module, and an example of how to use modules built from a compiled language. From there you will be able to integrate Python with other languages in your data science or machine learning projects.
Interpreters vs. Compilers
As I mentioned above, Python is an interpreted language. In order to execute code on a machine, the code first has to get translated to “machine code”.
The main difference between interpreted and compiled languages is when that translation happens. Interpreted languages are translated while the program runs: an interpreter works through the code statement by statement and executes each one as it goes.
Compiled languages require a build step that translates all the code at once into an application (or binary) that can be executed. During the build phase, the compiler optimizes the translation from written code to machine code. These optimizations are one of the reasons compiled languages are faster than interpreter-based languages. Common compiled languages are C, C++ and Go.
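To make the interpreter side a little more concrete, here is a minimal sketch using the standard library's dis module. It assumes CPython, which first compiles a function to bytecode and then executes those instructions one at a time; add_prices is a made-up function used only for illustration.

```python
import dis


def add_prices(open_price, close_price):
    """Toy function used only to have something to disassemble."""
    return open_price + close_price


# CPython compiles the function body to bytecode once; the interpreter
# loop then executes these instructions one at a time at run time.
instructions = list(dis.get_instructions(add_prices))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.opname, ins.argrepr)
```

The exact instruction names vary between Python versions, but every one of them is dispatched by the interpreter at run time, which is work a compiled binary does not have to do.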
C Types
Python has built-in libraries for calling modules built in compiled languages. This gives developers the flexibility to use compiled languages for tasks that Python isn’t well equipped to handle, all while still building the core of the application in Python.
The main library that we will be using to interact with modules from a compiled application is ctypes. It provides C-compatible data types and lets you call functions from shared libraries built in compiled languages, so you can wrap them in pure Python. To do this, Python first has to convert its function arguments into native C types so that they match the argument types of the compiled function. The figure below explains the process at a high level when interacting with functions from compiled languages.
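As a rough illustration of that conversion step, the sketch below wraps a few Python values in ctypes types and reads them back. No compiled library is needed for this part; the values ("AAPL", 42, 131.5) are placeholders chosen for the example.

```python
from ctypes import c_char_p, c_double, c_int

# Python str -> bytes -> C char* (the kind of value a *C.char parameter receives)
ticker = c_char_p("AAPL".encode("utf-8"))

# Python int / float -> C int / double
shares = c_int(42)
price = c_double(131.5)

# .value converts each C-compatible object back to a native Python value
print(ticker.value)  # bytes, not str
print(shares.value)  # int
print(price.value)   # float
```

Note that a C string comes back as bytes rather than str, which is why string arguments and return values need an explicit encode and decode step.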
Walk Through
Below is a Go function I wrote that gets the closing price of a stock using the Alpha Vantage API. I’ll go over the key lines that allow us to create a Python wrapper for it.
package main

import "C"

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"time"
)

// APIKEY ... Alpha Vantage API key, stored as env variable
var APIKEY string = os.Getenv("ALPHA_API_KEY")

// ENDPOINT ... Alpha Vantage API daily stock data endpoint
var ENDPOINT string = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED"

// Data ... Data structure
type Data struct {
	MetaData   MetaData               `json:"Meta Data"`
	TimeSeries map[string]interface{} `json:"Time Series (Daily)"`
}

// MetaData ... Stock metadata structure
type MetaData struct {
	Info        string `json:"1. Information"`
	Symbol      string `json:"2. Symbol"`
	LastRefresh string `json:"3. Last Refreshed"`
	OutputSize  string `json:"4. Output Size"`
	TimeZone    string `json:"5. Time Zone"`
}

// FinData ... Daily Financial Data json structure
type FinData struct {
	Open       string `json:"1. open"`
	High       string `json:"2. high"`
	Low        string `json:"3. low"`
	Close      string `json:"4. close"`
	AdjClose   string `json:"5. adjusted close"`
	Volume     string `json:"6. volume"`
	DivAmount  string `json:"7. dividend amount"`
	SplitCoeff string `json:"8. split coefficient"`
}

func main() {}

//export getPrice
func getPrice(ticker *C.char, date *C.char) *C.char {
	// Convert the C strings to Go strings
	tickerDate := C.GoString(date)
	stock := C.GoString(ticker)

	query := fmt.Sprintf("%s&symbol=%s&apikey=%s", ENDPOINT, stock, APIKEY)

	client := http.Client{
		Timeout: time.Second * 10, // Timeout after 10 seconds
	}

	req, err := http.NewRequest(http.MethodGet, query, nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("User-Agent", "stock-api-project")

	resp, getErr := client.Do(req)
	if getErr != nil {
		log.Fatal(getErr)
	}
	defer resp.Body.Close()

	respBody, _ := ioutil.ReadAll(resp.Body)

	dailyData := Data{}
	json.Unmarshal(respBody, &dailyData)

	// Encode Interface as bytes
	dailyFinDataMap := dailyData.TimeSeries[tickerDate]
	dfdByte, _ := json.Marshal(dailyFinDataMap)

	// Map interface to FinData struct
	dailyFinData := FinData{}
	json.Unmarshal(dfdByte, &dailyFinData)

	return C.CString(dailyFinData.AdjClose)
}
import "C"
- Import the C package (aka cgo) to have access to C data types.
func main() {}
- An empty main function. A shared library has to be built from package main, which requires a main function even though it is never called.
//export getPrice
- Export the function so that we can expose it to be accessed by Python.
func getPrice(ticker *C.char, date *C.char) *C.char {
- The function arguments need to be C types.
tickerDate := C.GoString(date)
- Convert the function arguments to their native Go types.
return C.CString(dailyFinData.AdjClose)
- The return value has to be a native C type.
To be able to use this in Python, we first have to compile the program as a C shared library:
go build -o stock-api.so -buildmode=c-shared main.go
Below is the associated Python wrapper that calls our exported Go function getPrice.
from ctypes import *
from datetime import date

def get_price(ticker: str) -> str:
    """Gets the adjusted closing price of a stock for today's date."""
    lib = cdll.LoadLibrary("./stock-api.so")
    lib.getPrice.argtypes = [c_char_p, c_char_p]
    lib.getPrice.restype = c_char_p

    curr_date = str(date.today())
    price = lib.getPrice(ticker.encode(), curr_date.encode()).decode()

    return price
from ctypes import *
- Import everything from ctypes as per the documentation.
lib = cdll.LoadLibrary("./stock-api.so")
- Load in the binary or compiled application.
lib.getPrice.argtypes = [c_char_p, c_char_p]
- Set the getPrice function argument types to the corresponding C types.
lib.getPrice.restype = c_char_p
- Set the return type to the corresponding C type.
price = lib.getPrice(ticker.encode(), curr_date.encode()).decode()
- We call the getPrice function and convert the Python string arguments into bytes that can be passed to the Go function by calling .encode. The Go function returns its output as bytes, so we call .decode to convert it to a Python string.
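One detail the wrapper above does not handle is memory ownership. C.CString allocates its result on the C heap, and with restype set to c_char_p, ctypes copies the bytes and drops the pointer, so that buffer can never be freed. The sketch below shows one way to keep the pointer around, using the C standard library's strdup as a stand-in for getPrice since it also returns a malloc'd string. It assumes a POSIX system (Linux or macOS) where CDLL(None) exposes libc; copy_via_c is a hypothetical helper name, not part of the original project.

```python
import ctypes

# Assumption: a POSIX system (Linux or macOS), where CDLL(None) exposes
# the C standard library already loaded into the Python process.
libc = ctypes.CDLL(None)

# strdup stands in for getPrice: it returns a malloc'd char*.
libc.strdup.argtypes = [ctypes.c_char_p]
libc.strdup.restype = ctypes.c_void_p  # keep the raw address, no auto-conversion
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def copy_via_c(text: str) -> str:
    """Round-trip a string through C and release the C-side buffer."""
    ptr = libc.strdup(text.encode("utf-8"))
    if not ptr:
        raise MemoryError("strdup returned NULL")
    try:
        # string_at reads the NUL-terminated bytes stored at the address
        return ctypes.string_at(ptr).decode("utf-8")
    finally:
        libc.free(ptr)


print(copy_via_c("131.50"))
```

For the Go library, the same pattern would likely mean setting lib.getPrice.restype to c_void_p and freeing the pointer after reading it, possibly through a small free function exported from Go. That export is an assumption on my part and is not in the code above.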
Conclusion
You now have most of the knowledge you need to get started creating wrappers for compiled language functions. No longer are you bound by the cons of interpreter-based languages. You can create modules in languages that better suit your use case, and then fall back on Python to build out a user-friendly API or the core of your application. Happy coding!
Feedback
I encourage any and all feedback about any of my posts and tutorials. You can message me on Twitter or e-mail me at sidhuashton@gmail.com.