Channel: Web Science and Digital Libraries Research Group

2023-09-19: Remote Development in VSCode - Worth the Hassle?


Visual Studio Code (VSCode) has a suite of extensions that lets you write and execute code on remote locations such as servers, containers, or virtual machines through your web browser or locally installed VSCode. In this post, I will explain why this may be a good option depending on your needs.

Some advantages of remote development in VSCode are:

  • It enables developers to write and execute code in a different environment (e.g., Linux, Windows, macOS) without having to install or configure them locally.
  • It reduces the need to transfer files between machines, which can save time and bandwidth.
  • VSCode preserves its settings, extensions, and features when working remotely, which provides a consistent development experience across local and remote environments.
  • It allows developers to leverage remote computing resources (e.g., GPUs, CPUs, Nodes) for computationally heavy tasks like deep learning.

Some disadvantages of remote development in VSCode are:

  • It requires a stable and fast network connection to work smoothly, as network latency and interruptions reduce the responsiveness of the remote editor.
  • The remote machines and the networks used to access them may introduce security risks. Developers should use secure protocols (e.g., SSH, HTTPS) and follow best practices for authentication and encryption.

A Bit of Background

I have been using PyCharm IDE for my research and collaborations for a long time. But when training deep learning models for some projects, I needed to use GPU resources beyond what my laptop provides. As a result, I started using our institutional HPC cluster (Wahab) to train models. At the beginning, I was developing locally, and then using scp / rclone / git CLIs to push my code onto Wahab, and executing them as SLURM jobs, or interactively through JupyterLab. Doing so came with three limitations.
  • When developing locally on my Mac, I had to push code to the HPC cluster, even to run the smallest of tests. To make matters worse, since the runtimes were different (MPS on Mac vs CUDA on Wahab), some bugs only surfaced when running on Wahab. This made my life difficult.
  • I had to use two different sets of commands to run locally and on the cluster. Also, I had to maintain two copies of virtual environments (Python) for local and remote development.
  • I had to keep copies of large datasets (100GB+) on my computer.
This is when I realized that if I developed on the cluster itself, I would avoid all three problems. At first, I tried using JupyterLab directly, but I soon started missing the type-checking and code-completion features (e.g., Pylance) provided by VSCode. This motivated me to give remote development a try.

VSCode Remote Development

Remote Development is an extension pack for VSCode that lets you use containers, remote machines, or Windows Subsystem for Linux (WSL) as full-featured development environments. Some of its key features are highlighted below:
  • Develop on the same operating system you deploy to or use larger or more specialized hardware.
  • Separate your development environment to avoid impacting your local machine configuration.
  • Make it easy for new contributors to get started and keep everyone in a consistent environment.
  • Use tools or runtimes not available on your local OS or manage multiple versions of them.
  • Develop your Linux-deployed applications using the Windows Subsystem for Linux.
  • Access an existing development environment from multiple machines or locations.
  • Debug an application running somewhere else such as a customer site or in the cloud.
Importantly, remote development eliminates the need for storing source code and data locally. It lets you run commands and other extensions in a container, in WSL, or on a remote machine, just like you would do locally, with no difference in look and feel.

The two screenshots below show a local development environment and a remote development environment for a single project.
Notice any difference? If not, you're correct, and that's the whole point! It feels the same as developing locally, including code completions, suggestions, etc. after installing the Python packages used in your project.


Code Completion during Remote Development

I hope this convinces you that remote development is as easy as local development, with the added benefit of rapidly building and training models without the high CPU / GPU utilization and battery drain on your own machine.

Setting Up VSCode for Remote Development

Now, let's see how to set up a remote development environment on your HPC cluster. Here, I'll focus on the Wahab Cluster. However, any cluster that uses SLURM would have a similar structure.

Step 1 - Install the required packages

The required packages are the Remote Development extensions introduced above. I also recommend the "Peacock" package, and using the "Workspace" feature, which I will come back to later.


Step 2 - Set up the SSH configuration

Once these packages are installed, we need to configure how to connect to the remote servers. To do this, open ~/.ssh/config and add an entry for each server you want to connect to.
To avoid repeated password/2FA prompts, I added a wildcard configuration to keep any remote connection alive. This means that opening a connection will also create a background connection for the duration specified under ControlPersist. Here, yes means forever, but if required, you can also define how many seconds/minutes you want the background connection to last.

Next, I defined two hosts: (1) oducs and (2) Wahab. For this article, I will focus only on the Wahab cluster.
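
As a rough illustration of the configuration described above, here is a minimal Python sketch that renders an ~/.ssh/config snippet with a wildcard keep-alive block and a single host entry. The host name wahab.example.edu and the user your_username are hypothetical placeholders, not the real cluster details. The ControlMaster and ControlPath lines are my assumption about what accompanies ControlPersist, since connection sharing in OpenSSH normally needs all three. Review the output before copying it into your own config.

```python
from textwrap import dedent


def render_ssh_config(user: str, wahab_host: str, persist: str = "yes") -> str:
    """Return an ssh_config snippet: a wildcard keep-alive block plus a 'wahab' host.

    `user` and `wahab_host` are placeholders supplied by the caller.
    `persist` is the ControlPersist value: "yes" keeps the background
    connection open indefinitely, while a value such as "10m" limits it.
    """
    return dedent(f"""\
        # Applies to every host: reuse one authenticated connection.
        Host *
            ControlMaster auto
            ControlPath ~/.ssh/control-%C
            ControlPersist {persist}

        # Hypothetical entry for the cluster; replace with your real details.
        Host wahab
            HostName {wahab_host}
            User {user}
        """)


if __name__ == "__main__":
    print(render_ssh_config("your_username", "wahab.example.edu"))
```

With an entry like this in place, the name after Host (here, wahab) is what shows up in the host list in Step 3.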

Step 3 - Connect to the Cluster

Now, let's see how to connect to the Wahab cluster. Click the "><" button at the bottom-left of the window, and then click "Connect to Host" or "Connect New Window to Host" on the popup that follows. Next, pick one of the hosts you added in Step 2, and if prompted, enter the username / password / OTP to establish the connection.


Voila! You're now connected to a remote instance through SSH. Simple, isn't it?

Now let's see how to set up a workspace and add multiple projects onto it. I would recommend using one, as it lets you (a) assign a virtual environment per project in the workspace, and (b) quickly navigate between projects. This simple hack drastically improved my workflow (e.g., code completion, refactoring).

Here's what it looks like when you first connect to the cluster. You can use the "Open Folder" button to open any folder from the editor.


Alternatively, you could create and use a workspace. A workspace allows you to maintain a set of folders from a single place. You could open a workspace via "File -> Open Workspace from File".


The Peacock package lets you assign a color to each workspace, so that you can easily tell which one you're currently on or switching to.

But why? If you're working on multiple projects with similar file structures, it's easy to mix them up and write code in the wrong project. Having a color-coded project structure helps to avoid such mistakes.
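
To make the workspace idea more concrete, here is a minimal Python sketch that builds the contents of a .code-workspace file, which is a JSON document listing the folders in the workspace. The project names and paths are hypothetical. The color is set by hand through VSCode's workbench.colorCustomizations setting to approximate the effect described above; Peacock itself is normally driven from the command palette and manages such settings for you, so treat this as an illustration rather than how Peacock works internally.

```python
from __future__ import annotations

import json


def build_workspace(folders: dict[str, str], status_bar_color: str | None = None) -> dict:
    """Build the dictionary for a multi-folder .code-workspace file.

    `folders` maps a display name to a folder path on the remote machine.
    `status_bar_color` is an optional hex color applied to the status bar.
    """
    workspace: dict = {
        "folders": [{"name": name, "path": path} for name, path in folders.items()]
    }
    if status_bar_color is not None:
        workspace["settings"] = {
            "workbench.colorCustomizations": {"statusBar.background": status_bar_color}
        }
    return workspace


def to_workspace_json(workspace: dict) -> str:
    """Serialize the workspace dictionary as indented JSON."""
    return json.dumps(workspace, indent=2) + "\n"


if __name__ == "__main__":
    # Hypothetical projects; replace the paths with folders on your cluster.
    ws = build_workspace(
        {
            "project-a": "/home/your_username/project-a",
            "project-b": "/home/your_username/project-b",
        },
        status_bar_color="#1f6feb",
    )
    print(to_workspace_json(ws))
```

Saving the output to a file such as research.code-workspace on the cluster (the file name is arbitrary) lets you open it via "File -> Open Workspace from File". The virtual environment for each project can then be chosen per folder with the Python extension's interpreter picker.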


Hope this motivated you to delve into the world of remote development :)

-- Yasith Jayawardana (@yasithdev)

