Tutorial: Integrating Python with Compiler Languages

Create a Python wrapper for functions written in Go.
python
data science
go
machine learning
Author

Ashton Sidhu

Published

October 25, 2020

Python - a programming language that is human readable, easy to learn, and most importantly, easy to use. It is no wonder that it is one of the most popular languages today. From being able to build web applications, create automation scripts, analyze data and build machine learning models, Python is a jack of all trades. As with anything it has its short comings which include its speed and lack of granular access to machine hardware. This is due to Python being an interpreter-based language and not a compiler-based language.

Luckily for us, there is a way to solve that problem. Python has a native library that allows you to use functions built in compiled languages. This gives you the ability to leverage both the powers of Python and compiler based languages. Some of the most popular data science libraries such as Pandas and XGBoost have modules written in C/C++ (compiler-based languages) to overcome Python’s speed and performance issues while still having a user friendly Python API.

In this post, we are going to walk through interpreter based languages vs. compiler based languages, Python’s built in ctypes module and an example of how to use modules built from a compiled language. From there you will be able to integrate Python with other languages in your data science or machine learning projects.

Interpreters vs. Compilers

As I mentioned above, Python is an interpreted language. In order to execute code on a machine, the code first has to get translated to “machine code”.

Note

Machine code is just binary or hexadecimal instructions.

The main difference between interpreted and compiled languages is that interpreted languages get executed line by line and passed through a translator that converts each line to machine code.

Compilers require a build step that translates all the code at once into an application (or binary) that can be executed. During the build phase there is a compiler that will optimize the translation from written code to machine code. The compiler’s optimizations are one of compiled based-languages are faster than interpreter-based languages. Common compiled languages are C, C++ and Go.

Note

This a high level comparison between interpreters and compilers. More in-depth knowledge would require individual research or a separate topic entirely.

C Types

Python has built in libraries to be able to call modules built in compiled languages. It gives developers the ultimate flexibility to use compiled languages for tasks that Python isn’t well equipped to handle - all the while still building the core of the application in Python.

The main library that we will be using to interact with modules from a compiled application is ctypes. It provides C compatible data types and calling functions from applications built in compiled languages. It gives you the ability to wrap these languages in pure Python. To be able to do this, Python first has to convert its function argument types into C native types. This allows it to be compatible with compiled language function’s argument types. The figure below explains the process at a high level when interacting with functions from compiled languages.

Note

The example in the next section will be using Go so the image is Go specific, but the principle applies to the other compiled languages.

Walk Through

Below is a Go function I wrote that gets the closing price of a stock using the Alpha Vantage API. I’ll go over the key lines that allow us to create a Python wrapper for it.

package main

import "C"

import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "time"
)

// APIKEY ... Alpha Vantage API key, stored as env variable
var APIKEY string = os.Getenv("ALPHA_API_KEY")

// ENDPOINT ... Alpha Vantage API daily stock data endpoint
var ENDPOINT string = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED"

// Data ... Data structure
type Data struct {
    MetaData   MetaData               `json:"Meta Data"`
    TimeSeries map[string]interface{} `json:"Time Series (Daily)"`
}

// MetaData ... Stock metadata structure
type MetaData struct {
    Info        string `json:"1. Information"`
    Symbol      string `json:"2. Symbol"`
    LastRefresh string `json:"3. Last Refreshed"`
    OutputSize  string `json:"4. Output Size"`
    TimeZone    string `json:"5. Time Zone"`
}

// FinData ... Daily Financial Data json structure
type FinData struct {
    Open       string `json:"1. open"`
    High       string `json:"2. high"`
    Low        string `json:"3. low"`
    Close      string `json:"4. close"`
    AdjClose   string `json:"5. adjusted close"`
    Volume     string `json:"6. volume"`
    DivAmount  string `json:"7. dividend amount"`
    SplitCoeff string `json:"8. split coefficient"`
}

func main() {}

//export getPrice
func getPrice(ticker *C.char, date *C.char) *C.char {

    tickerDate := C.GoString(date)
    stock := C.GoString(ticker)

    query := fmt.Sprintf("%s&symbol=%s&apikey=%s", ENDPOINT, stock, APIKEY)

    client := http.Client{
        Timeout: time.Second * 10, // Timeout after 5 seconds
    }

    req, err := http.NewRequest(http.MethodGet, query, nil)
    if err != nil {
        log.Fatal(err)
    }

    req.Header.Set("User-Agent", "stock-api-project")

    resp, getErr := client.Do(req)
    if getErr != nil {
        log.Fatal(getErr)
    }

    defer resp.Body.Close()

    respBody, _ := ioutil.ReadAll(resp.Body)

    dailyData := Data{}
    json.Unmarshal(respBody, &dailyData)

    // Encode Interface as bytes
    dailyFinDataMap := dailyData.TimeSeries[tickerDate]
    dfdByte, _ := json.Marshal(dailyFinDataMap)

    // Map interface to FinData struct
    dailyFinData := FinData{}
    json.Unmarshal(dfdByte, &dailyFinData)

    return C.CString(dailyFinData.AdjClose)
}

import “C” - Import the C package (aka cgo) to have access to C data types.

func main() {} - An empty main function to ensure we still have an executable.

//export getPrice - Export the function so that we can expose it to be accessed by Python.

func getPrice(ticker *C.char, date *C.char) *C.char { - The function arguments need to be C types.

tickerDate := C.GoString(date) - Convert the function arguments to their native Go types.

return C.CString(dailyFinData.AdjClose) - The return value has to be a native C type.

To be able to use this in Python we have to compile this program, go build -o stock-api.so -buildmode=c-shared main.go

Below is the associating Python wrapper that calls our exported Go function getPrice.

from ctypes import *

def get_price(ticker: str) -> Dict[str, str]:
    """Gets the adjusted closing price of all stocks."""

    lib = cdll.LoadLibrary("./stock-api.so")
    lib.getPrice.argtypes = [c_char_p, c_char_p]
    lib.getPrice.restype = c_char_p

    curr_date = str(date.today())
    price = lib.getPrice(ticker.encode(), curr_date.encode()).decode()

    return price

from ctypes import * - Import everything from ctypes as per the documentation.

lib = cdll.LoadLibrary("./stock-api.so") - Load in the binary or compiled application.

lib.getPrice.argtypes = [c_char_p, c_char_p] - Set the getPrice function argument types to the corresponding C types.

lib.getPrice.restype = c_char_p - Set the return type to the corresponding C type.

price = lib.getPrice(ticker.encode(), curr_date.encode()).decode() - We call the getPrice function and convert the Python string arguments into bytes that can be passed to the Go function by calling .encode. We then receive the output from Go function as bytes so we decode it to convert it to a Python string.

Conclusion

You now have most of the knowledge you need to get started creating wrappers for compiled language functions. No longer are you bound by the cons of interpreter based languages. You can create modules in languages that better suit your use case, and then fall back on Python to build out a user friendly API or the core of your application. Happy coding!

Feedback

I encourage any and all feedback about any of my posts and tutorials. You can message me on twitter or e-mail me at sidhuashton@gmail.com .