Last year, I wrote a blog post on the development and release of Type4Py. Type4Py is a machine learning model for code. In a nutshell, it predicts type annotations for Python source code files and enables developers to add types gradually to their codebases. At the time of the Type4Py release, its deployment was pretty simple. I didn’t use containerization (Docker) and Kubernetes, and the model was deployed on a single machine. There were two clear downsides to the initial deployment approach. First, I could not easily deploy the ML model and its pipeline on another machine. Because I had to install Type4Py and its dependencies on other machines, Second, the ML application could not be scaled well since a single machine’s resources are limited.
Over the past decade, machine learning (ML) has been applied successfully to a variety of tasks such as computer vision and natural language processing. Motivated by this, in recent years, researchers have employed ML techniques to solve code-related problems, including but not limited to, code completion, code generation, program repair, and type inference.
Dynamic programming languages like Python and TypeScript allows developers to optionally define type annotations and benefit from the advantages of static typing such as better code completion, early bug detection, and etc. However, retrofitting types is a cumbersome and error-prone process. To address this, we propose Type4Py, an ML-based type auto-completion for Python. It assists developers to gradually add type annotations to their codebases. In the following, I describe Type4Py’s pipeline, model, deployments, and the development of its VSCode extension and more.