Last year, I wrote a blog post on the development and release of Type4Py, a machine learning model for code. In a nutshell, it predicts type annotations for Python source code files and enables developers to add types gradually to their codebases. At the time of the Type4Py release, its deployment was fairly simple: I didn’t use containerization (Docker) or Kubernetes, and the model was deployed on a single machine. This initial approach had two clear downsides. First, I could not easily deploy the ML model and its pipeline on another machine, since Type4Py and all of its dependencies had to be installed manually on each one. Second, the ML application could not scale well, as a single machine’s resources are limited.
Over the past decade, machine learning (ML) has been applied successfully to a variety of tasks such as computer vision and natural language processing. Motivated by this, in recent years, researchers have employed ML techniques to solve code-related problems, including, but not limited to, code completion, code generation, program repair, and type inference.
Dynamic programming languages like Python and TypeScript allow developers to optionally define type annotations and benefit from the advantages of static typing, such as better code completion and early bug detection. However, retrofitting types is a cumbersome and error-prone process. To address this, we propose Type4Py, an ML-based type auto-completion tool for Python. It helps developers gradually add type annotations to their codebases. In the following, I describe Type4Py’s pipeline, model, deployments, the development of its VSCode extension, and more.
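To make the idea of gradual typing concrete, here is a small illustrative snippet (the function and names are hypothetical, not taken from Type4Py itself) showing a function before and after type annotations are added:

```python
from typing import List

# Before: no annotations; a type checker can infer little here.
def average(numbers):
    return sum(numbers) / len(numbers)

# After: annotated; tools like mypy can now catch misuse early,
# e.g. passing a string instead of a list of floats.
def average_typed(numbers: List[float]) -> float:
    return sum(numbers) / len(numbers)

print(average_typed([1.0, 2.0, 3.0]))  # prints 2.0
```

A type-prediction model like Type4Py aims to suggest annotations such as `List[float]` and `float` automatically, so developers can accept them instead of writing each one by hand.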
Nowadays, most people use scikit-learn for machine learning projects, and for good reason: it is a top-quality ML package for Python that lets you apply a machine learning algorithm in just a few lines of code, which is great!
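As a quick sketch of what “a few lines” looks like, here is a minimal scikit-learn example that trains an SVM classifier on the bundled iris dataset (the dataset and parameter choices are just for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a toy dataset and split it into train/test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit an SVM classifier and measure accuracy on the held-out data.
clf = SVC(kernel="rbf").fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")
```

The entire workflow of loading data, splitting, training, and evaluating fits in under ten lines, which is a big part of scikit-learn’s appeal.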
As a machine learning researcher, I personally like to try other machine learning libraries; it’s good to have more than one ML library in your arsenal. Since I was using C++ for my projects, I decided to try a C++ machine learning library.
Recently, I introduced the LightTwinSVM program on my blog (if you haven’t read it, check out this post). It is a fast and simple implementation of the TwinSVM classifier. Some people might ask why they should use this program over other popular SVM implementations such as LIBSVM and scikit-learn. The short answer is that TwinSVM achieves better accuracy than standard SVM in most cases.
To show the effectiveness of the LightTwinSVM program in terms of accuracy, experiments were conducted on 10 UCI benchmark datasets.
Support Vector Machine (SVM) was proposed by Vapnik and Cortes in 1995. It is a very popular and powerful classification algorithm. The main idea of SVM is to find an optimal separating hyperplane between two classes. Thanks to its great classification ability, SVM has been applied to a wide variety of applications.
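To recall the standard textbook formulation: given training samples $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, the hard-margin SVM finds the maximum-margin separating hyperplane by solving

```latex
\min_{w,\,b} \ \frac{1}{2}\|w\|^2
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1, \quad i = 1, \dots, n
```

Minimizing $\|w\|$ maximizes the margin $2/\|w\|$ between the two classes, and the samples that lie exactly on the margin boundaries are the support vectors that give the method its name.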
Over the past decade, researchers have proposed classifiers based on SVM. Among these extensions, I’d like to introduce Twin Support Vector Machine (TSVM), as it has received considerable attention.
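As a brief sketch of the core idea: instead of one separating hyperplane, TSVM finds two non-parallel hyperplanes, each passing close to the samples of one class while staying far from the other,

```latex
x^\top w_1 + b_1 = 0, \qquad x^\top w_2 + b_2 = 0,
```

and a new sample $x$ is assigned to the class whose hyperplane is nearest:

```latex
\text{class}(x) = \arg\min_{i \in \{1,2\}} \ \frac{|x^\top w_i + b_i|}{\|w_i\|}.
```

Because each hyperplane is obtained from a smaller optimization problem involving roughly half the constraints of standard SVM, TSVM can also be faster to train on two-class problems.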