Using Visual Studio Code For Python




The Python extension supports testing with Python's built-in unittest framework as well as pytest. Nose is also supported, although the framework itself is in maintenance mode.

After enabling a test framework, use the Python: Discover Tests command to scan the project for tests according to the discovery patterns of the currently selected test framework. Once discovered, Visual Studio Code provides a variety of means to run tests and debug tests. VS Code displays test output in the Python Test Log panel, including errors caused when a test framework is not installed. With pytest, failed tests also appear in the Problems panel.

A little background on unit testing

(If you're already familiar with unit testing, you can skip to the walkthroughs.)

A unit is a specific piece of code to be tested, such as a function or a class. Unit tests are then other pieces of code that specifically exercise the code unit with a full range of different inputs, including boundary and edge cases.

For example, say you have a function to validate the format of an account number that a user enters in a web form:
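
One hypothetical version (the function name and rules are illustrative; the body is deliberately empty, because unit tests care only about the interface):

    def validate_account_number_format(account_string):
        # Return True if account_string is a properly formatted account
        # number, False otherwise. Implementation omitted; in practice it
        # might delegate to well-tested helper libraries.
        ...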

Unit tests are concerned only with the unit's interface—its arguments and return values—not with its implementation (which is why no code is shown here in the function body; often you'd be using other well-tested libraries to help implement the function). In this example, the function accepts any string and returns true if that string contains a properly formatted account number, false otherwise.

To thoroughly test this function, you want to throw at it every conceivable input: valid strings, mistyped strings (off by one or two characters, or containing invalid characters), strings that are too short or too long, blank strings, null arguments, strings containing control characters (non-text codes), strings containing HTML, strings containing injection attacks (such as SQL commands or JavaScript code), and so on. It's especially important to test security cases like injection attacks if the validated string is later used in database queries or displayed in the app's UI.

For each input, you then define the function's expected return value (or values). In this example, again, the function should return true for only properly formatted strings. (Whether the number itself is a real account is a different matter that would be handled elsewhere through a database query.)

With all the arguments and expected return values in hand, you now write the tests themselves, which are pieces of code that call the function with a particular input, then compare the actual return value with the expected return value (this comparison is called an assertion):
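
A sketch of one such test, written with a plain assert in the pytest style (the validator module and the known-good input are the hypothetical ones from above):

    import validator  # the module containing the function under test

    def test_validator_valid_string():
        # Call the unit with a known-good input and assert the expected result
        assert validator.validate_account_number_format("1234567890") is True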

The exact structure of the code depends on the test framework you're using, and specific examples are provided later in this article. In any case, as you can see, each test is very simple: invoke the function with an argument and assert the expected return value.

The combined results of all the tests are your test report, which tells you whether the function (the unit) is behaving as expected across all test cases. That is, when a unit passes all of its tests, you can be confident that it's functioning properly. (In test-driven development, you actually write the tests first, then write the code to pass more and more tests until all of them pass.)

Because unit tests are small, isolated pieces of code (in unit testing you avoid external dependencies and use mock data or otherwise simulated inputs), they're quick and inexpensive to run. This characteristic means that you can run unit tests early and often. Developers typically run unit tests even before committing code to a repository; gated check-in systems can also run unit tests before merging a commit. Many continuous integration systems also run unit tests after every build. Running unit tests early and often means that you quickly catch regressions, which are unexpected changes in the behavior of code that previously passed all its unit tests. Because a test failure can easily be traced to a particular code change, it's easy to find and remedy the cause of the failure, which is undoubtedly better than discovering a problem much later in the process!

For a general background on unit testing, see Unit Testing on Wikipedia. For a variety of useful unit test examples, see https://github.com/gwtw/py-sorting, a repository with tests for different sorting algorithms.

Example test walkthroughs

Python tests are Python classes that reside in separate files from the code being tested. Each test framework specifies the structure and naming of tests and test files. Once you write tests and enable a test framework, VS Code locates those tests and provides you with various commands to run and debug them.

For this section, create a folder and open it in VS Code. Then create a file named inc_dec.py with the following code to be tested:
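
A minimal version consistent with the walkthroughs that follow:

    def increment(x):
        return x + 1

    def decrement(x):
        return x - 1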

With this code, you can experience working with tests in VS Code as described in the sections that follow.

Enable a test framework

Testing in Python is disabled by default. To enable testing, use the Python: Configure Tests command on the Command Palette. This command prompts you to select a test framework, the folder containing tests, and the pattern used to identify test files.

You can also configure testing manually by setting one and only one of the following settings to true: python.testing.unittestEnabled, python.testing.pytestEnabled, and python.testing.nosetestsEnabled. Each framework also has specific settings for its folders and patterns, as described under Test configuration settings.

It's important that you enable only a single test framework at a time. For this reason, when you enable one framework, also be sure to disable the others. The Python: Configure Tests command does this automatically.

When you enable a test framework, VS Code prompts you to install the framework package if it's not already present in the currently activated environment:

Create tests

Each test framework has its own conventions for naming test files and structuring the tests within, as described in the following sections. Each case includes two test methods, one of which is intentionally set to fail for the purposes of demonstration.

Because Nose is in maintenance mode and not recommended for new projects, only unittest and pytest examples are shown in the sections that follow. (Nose2, the successor to Nose, is just unittest with plugins, and so it follows the unittest patterns shown here.)


Tests in unittest

Create a file named test_unittest.py that contains a test class with two test methods:
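
For example (the test_decrement assertion deliberately expects the wrong value, 4, so that it fails as described above):

    import inc_dec    # The code to test
    import unittest   # The test framework

    class Test_TestIncrementDecrement(unittest.TestCase):
        def test_increment(self):
            self.assertEqual(inc_dec.increment(3), 4)

        def test_decrement(self):
            # Intentionally faulty: decrement(3) actually returns 2
            self.assertEqual(inc_dec.decrement(3), 4)

    if __name__ == '__main__':
        unittest.main()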

Tests in pytest

Create a file named test_pytest.py that contains two test methods:
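
For example (again, test_decrement is intentionally faulty):

    import inc_dec    # The code to test

    def test_increment():
        assert inc_dec.increment(3) == 4

    def test_decrement():
        # Intentionally faulty: decrement(3) actually returns 2
        assert inc_dec.decrement(3) == 4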

Test discovery

VS Code uses the currently enabled testing framework to discover tests. You can trigger test discovery at any time using the Python: Discover Tests command.

python.testing.autoTestDiscoverOnSaveEnabled is set to true by default, meaning test discovery is performed automatically whenever you save a test file. To disable this feature, set the value to false.

Test discovery applies the discovery patterns for the current framework (which can be customized using the Test configuration settings). The default behavior is as follows:

  • python.testing.unittestArgs: Looks for any Python (.py) file with 'test' in the name in the top-level project folder. All test files must be importable modules or packages. You can customize the file matching pattern with the -p configuration setting, and customize the folder with the -t setting.

  • python.testing.pytestArgs: Looks for any Python (.py) file whose name begins with 'test_' or ends with '_test', located anywhere within the current folder and all subfolders.

Tip: Sometimes tests placed in subfolders aren't discovered because such test files cannot be imported. To make them importable, create an empty file named __init__.py in that folder.

If discovery succeeds, the status bar shows Run Tests instead:

If discovery fails (for example, the test framework isn't installed), you see a notification on the status bar. Selecting the notification provides more information:

Once VS Code recognizes tests, it provides several ways to run those tests as described in Run tests. The most obvious means are CodeLens adornments that appear directly in the editor and allow you to easily run a single test method or, with unittest, a test class:

Note: At present, the Python extension doesn't provide a setting to turn the adornments on or off. To suggest a different behavior, file an issue on the vscode-python repository.

For Python, test discovery also activates the Test Explorer with an icon on the VS Code activity bar. The Test Explorer helps you visualize, navigate, and run tests:


Run tests

You run tests using any of the following actions:

  • With a test file open, select the Run Test CodeLens adornment that appears above a test method or a class, as shown in the previous section. This command runs only that one method or only those tests in the class.

  • Select Run Tests on the Status Bar (which can change appearance based on results), then select one of the commands like Run All Tests or Discover Tests:

  • In Test Explorer:

    • To run all discovered tests, select the play button at the top of Test Explorer:

    • To run a specific group of tests, or a single test, select the file, class, or test, then select the play button to the right of that item:

  • Right-click a file in Explorer and select Run All Tests, which runs the tests in that one file.

  • From the Command Palette, select any of the run test commands:

    Command                 Description
    Debug All Tests         See Debug tests.
    Debug Test Method       See Debug tests.
    Run All Tests           Searches for and runs all tests in the workspace and its subfolders.
    Run Current Test File   Runs the tests in the file that's currently viewed in the editor.
    Run Failed Tests        Re-runs any tests that failed in a previous test run. Runs all tests if no tests have been run yet.
    Run Test File           Prompts for a specific test filename, then runs the tests in that file.
    Run Test Method         Prompts for the name of a test to run, providing auto-completion for test names.
    Show Test Output        Opens the Python Test Log panel with information about passing and failing tests, as well as errors and skipped tests.

After a test run, VS Code displays results directly with the CodeLens adornments in the editor and in Test Explorer. Results are shown both for individual tests as well as any classes and files containing those tests. Failed tests are also adorned in the editor with a red underline.

VS Code also shows test results in the Python Test Log output panel (use the View > Output menu command to show the Output panel, then select Python Test Log from the dropdown on the right side):

With pytest, failed tests also appear in the Problems panel, where you can double-click on an issue to navigate directly to the test:

Run tests in parallel

Support for running tests in parallel with pytest is available through the pytest-xdist package. To enable parallel testing:

  1. Open the integrated terminal and install the pytest-xdist package (example commands for Windows and macOS/Linux are shown after these steps). For more details, refer to the project's documentation page.

  2. Next, create a file named pytest.ini in your project directory and add content specifying the number of CPUs to be used; an example for 4 CPUs is shown after these steps.

  3. Run your tests, which will now run in parallel.
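
The install commands for step 1 might look like the following, depending on how Python is invoked on your system:

    # Windows
    py -3 -m pip install pytest-xdist

    # macOS/Linux
    python3 -m pip install pytest-xdist

And a pytest.ini for step 2 that distributes tests across 4 CPUs (the -n option comes from pytest-xdist):

    [pytest]
    addopts = -n4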


Debug tests

You might occasionally need to step through and analyze tests in the debugger, either because the tests themselves have a code defect you need to track down or in order to better understand why an area of code being tested is failing.

For example, the test_decrement functions given earlier are failing because the assertion itself is faulty. The following steps demonstrate how to analyze the test:

  1. Set a breakpoint on the first line of the test_decrement function.

  2. Select the Debug Test adornment above that function or the 'bug' icon for that test in Test Explorer. VS Code starts the debugger and pauses at the breakpoint.

  3. In the Debug Console panel, enter inc_dec.decrement(3) to see that the actual result is 2, whereas the expected result specified in the test is the incorrect value of 4.

  4. Stop the debugger and correct the faulty code (the corrected assertions are shown after these steps):

  5. Save the file and run the tests again to confirm that they pass, and see that the CodeLens adornments also indicate passing status.

    Note: Running or debugging a test does not automatically save the test file. Always be sure to save changes to a test before running it, otherwise you'll likely be confused by the results because they still reflect the previous version of the file!
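
The corrected assertions for step 4 (decrement(3) returns 2, so 2 is the value the tests should expect):

    # unittest version
    self.assertEqual(inc_dec.decrement(3), 2)

    # pytest version
    assert inc_dec.decrement(3) == 2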


The Python: Debug All Tests and Python: Debug Test Method commands (on both the Command Palette and Status Bar menu) launch the debugger for all tests and a single test method, respectively. You can also use the 'bug' icons in Test Explorer to launch the debugger for all tests in a selected scope as well as all discovered tests.

The debugger works the same for tests as for other Python code, including breakpoints, variable inspection, and so on. To customize settings for debugging tests, you can specify 'request':'test' in the launch.json file in the .vscode folder of your workspace. This configuration will be used when you run the Python: Debug All Tests and Python: Debug Test Method commands.

For example, the configuration below in the launch.json file disables the justMyCode setting for debugging tests:
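
A sketch of such an entry (the name and console values are illustrative; "request": "test" is what marks it as the test debugging configuration):

    {
        "name": "Python: Debug Tests",
        "type": "python",
        "request": "test",
        "console": "integratedTerminal",
        "justMyCode": false
    }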

If you have more than one configuration entry with 'request':'test', the first definition will be used since we currently don't support multiple definitions for this request type.

For more information on debugging, see Python debugging configurations and the general VS Code Debugging article.

Test configuration settings

The behavior of testing with Python is driven by both general settings and settings that are specific to whichever framework you've enabled.

General settings

Setting (python.testing.)       Default   Description
autoTestDiscoverOnSaveEnabled   true      Specifies whether to enable or disable automatic test discovery when saving a test file.
cwd                             null      Specifies an optional working directory for tests.
debugPort                       3000      Port number used for debugging of unittest tests.
promptToConfigure               true      Specifies whether VS Code prompts to configure a test framework if potential tests are discovered.

unittest configuration settings

Setting (python.testing.)   Default                                Description
unittestEnabled             false                                  Specifies whether unittest is enabled as the test framework. All other frameworks should be disabled.
unittestArgs                ['-v', '-s', '.', '-p', '*test*.py']   Arguments to pass to unittest, where each element that's separated by a space is a separate item in the list. See below for a description of the defaults.

The default arguments for unittest are as follows:

  • -v sets default verbosity. Remove this argument for simpler output.
  • -s . specifies the starting directory for discovering tests. If you have tests in a 'test' folder, change the argument to -s test (meaning '-s', 'test' in the arguments array).
  • -p *test*.py is the discovery pattern used to look for tests. In this case, it's any .py file that includes the word 'test'. If you name test files differently, such as appending '_test' to every filename, then use a pattern like *_test.py in the appropriate argument of the array.

To stop a test run on the first failure, add the fail fast option '-f' to the arguments array.
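
For example, a settings.json entry that enables unittest, stops on the first failure, and looks for files ending in _test.py inside a test folder might look like this (a sketch combining the options described above; VS Code's settings.json accepts comments):

    {
        "python.testing.unittestEnabled": true,
        "python.testing.unittestArgs": [
            "-v",               // verbose output
            "-f",               // stop the run on the first failure
            "-s", "test",       // start discovery in the 'test' folder
            "-p", "*_test.py"   // discovery pattern for test files
        ]
    }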

See unittest command-line interface for the full set of available options.

pytest configuration settings

Setting (python.testing.)   Default    Description
pytestEnabled               false      Specifies whether pytest is enabled as the test framework. All other frameworks should be disabled.
pytestPath                  'pytest'   Path to pytest. Use a full path if pytest is located outside the current environment.
pytestArgs                  []         Arguments to pass to pytest, where each element that's separated by a space is a separate item in the list. See pytest command-line options.

You can also configure pytest using a pytest.ini file as described on pytest Configuration.

Note: If you have the pytest-cov coverage module installed, VS Code doesn't stop at breakpoints while debugging because pytest-cov is using the same technique to access the source code being run. To prevent this behavior, include --no-cov in pytestArgs when debugging tests, for example by adding 'env': {'PYTEST_ADDOPTS': '--no-cov'} to your debug configuration. (See Debug tests above for how to set up that launch configuration.) For more information, see Debuggers and PyCharm in the pytest-cov documentation.

Nose configuration settings


Setting (python.testing.)   Default       Description
nosetestsEnabled            false         Specifies whether Nose is enabled as the test framework. All other frameworks should be disabled.
nosetestPath                'nosetests'   Path to Nose. Use a full path if Nose is located outside the current environment.
nosetestArgs                []            Arguments to pass to Nose, where each element that's separated by a space is a separate item in the list. See Nose usage options.

You can also configure nose with a .noserc or nose.cfg file as described on Nose configuration.

See also

  • Python environments - Control which Python interpreter is used for editing and debugging.
  • Settings reference - Explore the full range of Python-related settings in VS Code.

Data science tutorial

This tutorial demonstrates using Visual Studio Code and the Microsoft Python extension with common data science libraries to explore a basic data science scenario. Specifically, using passenger data from the Titanic, you will learn how to set up a data science environment, import and clean data, create a machine learning model for predicting survival on the Titanic, and evaluate the accuracy of the generated model.

Prerequisites

The following installations are required for the completion of the tutorial. If you do not have them already, install them prior to beginning.

  • Visual Studio Code

  • The Python extension for VS Code from the Visual Studio Marketplace. For additional details on installing extensions, see Extension Marketplace. The Python extension is named Python and published by Microsoft.

  • Miniconda

    Note: If you already have the full Anaconda distribution installed, you don't need to install Miniconda. Alternatively, if you'd prefer not to use Anaconda or Miniconda, you can create a Python virtual environment and install the packages needed for the tutorial using pip. If you go this route, you will need to install the following packages: pandas, jupyter, seaborn, scikit-learn, keras, and tensorflow.

Set up a data science environment

Visual Studio Code and the Python extension provide a great editor for data science scenarios. With native support for Jupyter notebooks combined with Anaconda, it's easy to get started. In this section, you will create a workspace for the tutorial, create an Anaconda environment with the data science modules needed for the tutorial, and create a Jupyter notebook that you'll use for creating a machine learning model.

  1. Begin by creating an Anaconda environment for the data science tutorial. Open an Anaconda command prompt and run conda create -n myenv python=3.7 pandas jupyter seaborn scikit-learn keras tensorflow to create an environment named myenv. For additional information about creating and managing Anaconda environments, see the Anaconda documentation.

  2. Next, create a folder in a convenient location to serve as your VS Code workspace for the tutorial, and name it hello_ds.

  3. Open the project folder in VS Code by running VS Code and using the File > Open Folder command.

  4. Once VS Code launches, open the Command Palette (View > Command Palette or ⇧⌘P (Windows, Linux Ctrl+Shift+P)). Then select the Python: Select Interpreter command:

  5. The Python: Select Interpreter command presents the list of available interpreters that VS Code was able to locate automatically (your list will vary from the one shown below; if you don't see the desired interpreter, see Configuring Python environments). From the list, select the Anaconda environment you created, which should include the text 'myenv': conda.

  6. With the environment and VS Code setup, the final step is to create the Jupyter notebook that will be used for the tutorial. Open the Command Palette (⇧⌘P (Windows, Linux Ctrl+Shift+P)) and select Jupyter: Create New Blank Jupyter Notebook.

    Note: Alternatively, from the VS Code File Explorer, you can use the New File icon to create a Notebook file named hello.ipynb.

  7. Use the Save icon on the main notebook toolbar to save the notebook with the filename hello.

  8. After your file is created, you should see the open Jupyter notebook in the native notebook editor. For additional information about native Jupyter notebook support, see this section of the documentation.

Prepare the data


This tutorial uses the Titanic dataset available on OpenML.org, which is obtained from Vanderbilt University's Department of Biostatistics at http://biostat.mc.vanderbilt.edu/DataSets. The Titanic data provides information about the survival of passengers on the Titanic, as well as characteristics about the passengers such as age and ticket class. Using this data, the tutorial will establish a model for predicting whether a given passenger would have survived the sinking of the Titanic. This section shows how to load and manipulate data in your Jupyter notebook.

  1. To begin, download the Titanic data from OpenML.org as a csv file named data.csv and save it to the hello_ds folder that you created in the previous section.

  2. In VS Code, open the hello_ds folder and the Jupyter notebook (hello.ipynb) by going to File > Open Folder.

  3. Within your Jupyter notebook begin by importing the pandas and numpy libraries, two common libraries used for manipulating data, and loading the Titanic data into a pandas DataFrame. To do so, copy the below code into the first cell of the notebook. For additional guidance about working with Jupyter notebooks in VS Code, see the Working with Jupyter Notebooks documentation.
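
    A first cell along these lines loads the file (assuming the data was saved as data.csv in the workspace folder, as described in step 1):

        import pandas as pd
        import numpy as np

        data = pd.read_csv('data.csv')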

  4. Now, run the cell using the Run cell icon or the Shift+Enter shortcut.

  5. After the cell finishes running, you can view the data that was loaded using the variable explorer and data viewer. First click on the chart icon in the notebook's upper toolbar, then the data viewer icon to the right of the data variable. For additional information about the data set, refer to this document about how it was constructed.

    You can then use the data viewer to view, sort, and filter the rows of data. After reviewing the data, it can then be helpful to graph some aspects of it to help visualize the relationships between the different variables.

  6. Before the data can be graphed though, you need to make sure that there aren't any issues with it. If you look at the Titanic csv file, one thing you'll notice is that a question mark ('?') was used to designate cells where data wasn't available.

    While Pandas can read this value into a DataFrame, the result for a column like Age is that its data type will be set to Object instead of a numeric data type, which is problematic for graphing.

    This problem can be corrected by replacing the question mark with a missing value that pandas is able to understand. Add the following code to the next cell in your notebook to replace the question marks in the age and fare columns with the numpy NaN value. Notice that we also need to update the columns' data types after replacing the values.
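
    For example:

        data.replace('?', np.nan, inplace=True)
        data = data.astype({"age": np.float64, "fare": np.float64})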

    Tip: To add a new cell, you can use the insert cell icon that's in the bottom left corner of an existing cell. Alternatively, you can also press Esc to enter command mode, followed by the B key.

    Note: If you ever need to see the data type that has been used for a column, you can use the DataFrame dtypes attribute.

  7. Now that the data is in good shape, you can use seaborn and matplotlib to view how certain columns of the dataset relate to survivability. Add the following code to the next cell in your notebook and run it to see the generated plots.
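
    One possible set of plots (the specific plot types and column pairings here are a sketch; choose whichever relationships interest you):

        import matplotlib.pyplot as plt
        import seaborn as sns

        fig, axs = plt.subplots(ncols=5, figsize=(30, 5))
        sns.violinplot(x="survived", y="age", hue="sex", data=data, ax=axs[0])
        sns.pointplot(x="sibsp", y="survived", hue="sex", data=data, ax=axs[1])
        sns.pointplot(x="parch", y="survived", hue="sex", data=data, ax=axs[2])
        sns.pointplot(x="pclass", y="survived", hue="sex", data=data, ax=axs[3])
        sns.violinplot(x="survived", y="fare", data=data, ax=axs[4])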

    Note: To better view details on the graphs, you can open them in plot viewer by hovering over the upper left corner of the graph and clicking the button that appears.

  8. These graphs are helpful in seeing some of the relationships between survival and the input variables of the data, but it's also possible to use pandas to calculate correlations. To do so, all the variables used need to be numeric for the correlation calculation and currently gender is stored as a string. To convert those string values to integers, add and run the following code.
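
    For example:

        data.replace({'male': 1, 'female': 0}, inplace=True)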

  9. Now, you can analyze the correlation between all the input variables to identify the features that would be the best inputs to a machine learning model. The closer a value is to 1, the higher the correlation between the value and the result. Use the following code to correlate the relationship between all variables and survival.
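
    One way to do this with pandas (numeric_only=True skips non-numeric columns; on pandas versions older than 1.5 omit it, as those versions skip them automatically):

        data.corr(numeric_only=True).abs()[["survived"]]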

  10. Looking at the correlation results, you'll notice that some variables like gender have a fairly high correlation to survival, while others like relatives (sibsp = siblings or spouse, parch = parents or children) seem to have little correlation.

    Let's hypothesize that sibsp and parch are related in how they affect survivability, and group them into a new column called 'relatives' to see whether the combination of them has a higher correlation to survivability. To do this, you will check whether, for a given passenger, the sum of sibsp and parch is greater than 0 and, if so, say that the passenger had a relative on board.

    Use the following code to create a new variable and column in the dataset called relatives and check the correlation again.
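
    For example:

        data['relatives'] = data.apply(lambda row: int((row['sibsp'] + row['parch']) > 0), axis=1)
        data.corr(numeric_only=True).abs()[["survived"]]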

  11. You'll notice that in fact when looked at from the standpoint of whether a person had relatives, versus how many relatives, there is a higher correlation with survival. With this information in hand, you can now drop from the dataset the low value sibsp and parch columns, as well as any rows that had NaN values, to end up with a dataset that can be used for training a model.

    Note: Although age had a low direct correlation, it was kept because it seems reasonable that it might still have correlation in conjunction with other inputs.
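
    The drop described in step 11 can be a single line, keeping only the columns used for training and removing rows with NaN values:

        data = data[['sex', 'pclass', 'age', 'relatives', 'fare', 'survived']].dropna()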

Train and evaluate a model

With the dataset ready, you can now begin creating a model. For this section you'll use the scikit-learn library (as it offers some useful helper functions) to do pre-processing of the dataset, train a classification model to determine survivability on the Titanic, and then use that model with test data to determine its accuracy.

  1. A common first step to training a model is to divide up the dataset into training and validation data. This allows you to use a portion of the data to train the model and a portion of the data to test the model. If you used all your data to train the model, you wouldn't have a way to estimate how well it would actually perform against data the model has not yet seen. A benefit of the scikit-learn library is that it provides a method specifically for splitting a dataset into training and test data.

    Add a cell with the following code to the notebook and run it to split up the data.
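
    A sketch using scikit-learn's train_test_split (the 80/20 split and the fixed random_state are illustrative choices):

        from sklearn.model_selection import train_test_split

        x_train, x_test, y_train, y_test = train_test_split(
            data[['sex', 'pclass', 'age', 'relatives', 'fare']],  # inputs
            data.survived,                                        # output to predict
            test_size=0.2,                                        # hold back 20% for testing
            random_state=0)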

  2. Next, you'll normalize the inputs such that all features are treated equally. For example, within the dataset the values for age range from ~0-100, while gender is only a 1 or 0. By normalizing all the variables, you can ensure that the ranges of values are all the same. Use the following code in a new code cell to scale the input values.
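
    For example, with scikit-learn's StandardScaler:

        from sklearn.preprocessing import StandardScaler

        sc = StandardScaler()
        X_train = sc.fit_transform(x_train)  # fit the scaler on training data only
        X_test = sc.transform(x_test)        # apply the same scaling to the test data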

  3. There are a number of different machine learning algorithms that you could choose from to model the data and scikit-learn provides support for a number of them, as well as a chart to help select the one that's right for your scenario. For now, use the Naïve Bayes algorithm, a common algorithm for classification problems. Add a cell with the following code to create and train the algorithm.
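
    For example:

        from sklearn.naive_bayes import GaussianNB

        model = GaussianNB()
        model.fit(X_train, y_train)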

  4. With a trained model, you can now try it against the test data set that was held back from training. Add and run the following code to predict the outcome of the test data and calculate the accuracy of the model.
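
    For example:

        from sklearn import metrics

        predict_test = model.predict(X_test)
        print(metrics.accuracy_score(y_test, predict_test))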

    Looking at the result of the test data, you'll see that the trained algorithm had a ~75% success rate at estimating survival.

(Optional) Use a neural network to increase accuracy

A neural network is a model that uses weights and activation functions, modeling aspects of human neurons, to determine an outcome based on provided inputs. Unlike the machine learning algorithm you looked at previously, neural networks are a form of deep learning wherein you don't need to know an ideal algorithm for your problem set ahead of time. It can be used for many different scenarios and classification is one of them. For this section, you'll use the Keras library with TensorFlow to construct the neural network, and explore how it handles the Titanic dataset.

  1. The first step is to import the required libraries and to create the model. In this case, you'll use a Sequential neural network, which is a layered neural network wherein there are multiple layers that feed into each other in sequence.
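
    For example (with recent TensorFlow releases, the same classes are available as tensorflow.keras.models and tensorflow.keras.layers):

        from keras.models import Sequential
        from keras.layers import Dense

        model = Sequential()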

  2. After defining the model, the next step is to add the layers of the neural network. For now, let's keep things simple and just use three layers. Add the following code to create the layers of the neural network.
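
    A version matching the description in the notes below (the uniform kernel initializer is an illustrative choice):

        model.add(Dense(5, kernel_initializer='uniform', activation='relu', input_dim=5))
        model.add(Dense(5, kernel_initializer='uniform', activation='relu'))
        model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))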

    • The first layer will be set to have a dimension of 5, since you have 5 inputs: sex, pclass, age, relatives, and fare.
    • The last layer must output 1, since you want a 1-dimensional output indicating whether a passenger would survive.
    • The middle layer was kept at 5 for simplicity, although that value could have been different.

    The rectified linear unit (relu) activation function is used as a good general activation function for the first two layers, while the sigmoid activation function is required for the final layer as the output you want (of whether a passenger survives or not) needs to be scaled in the range of 0-1 (the probability of a passenger surviving).

    You can also look at the summary of the model you built with this line of code:
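
        model.summary()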

  3. Once the model is created, it needs to be compiled. As part of this, you need to define what type of optimizer will be used, how loss will be calculated, and what metric should be optimized for. Add the following code to build and train the model. You'll notice that after training the accuracy is ~80%.
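
    A sketch (the adam optimizer and binary cross-entropy loss suit a binary classification like this one; the batch size and epoch count are illustrative):

        model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
        model.fit(X_train, y_train, batch_size=32, epochs=50)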

    Note: This step may take anywhere from a few seconds to a few minutes to run depending on your machine.

  4. With the model built and trained, it's now time to see how it performs against the test data.
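
    For example, rounding each predicted probability to 0 or 1 before scoring:

        y_pred = np.rint(model.predict(X_test).flatten())
        print(metrics.accuracy_score(y_test, y_pred))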

    Similar to the training, you'll notice that you were able to get close to 80% accuracy in predicting survival of passengers. This result was better than the 75% accuracy from the Naive Bayes Classifier tried previously.

Next steps

Now that you're familiar with the basics of performing machine learning within Visual Studio Code, here are some other Microsoft resources and tutorials to check out.


  • Learn more about working with Jupyter Notebooks in Visual Studio Code (video).
  • Get started with Azure Machine Learning for VS Code to deploy and optimize your model using the power of Azure.
  • Find additional data to explore on Azure Open Data Sets.