How to contribute to scikit-image
Developing Open Source is great fun! Join us on the scikit-image developer forum and tell us which of the following challenges you’d like to solve.
Mentoring is available for those new to scientific programming in Python.
If you’re looking for something to implement or to fix, you can browse the open issues on GitHub.
Here’s the long and short of it:
If you are a first-time contributor:
Go to scikit-image/scikit-image and click the “fork” button to create your own copy of the project.
Clone the project to your local computer:
git clone https://github.com/your-username/scikit-image.git
Change the directory:
cd scikit-image
Add the upstream repository:
git remote add upstream https://github.com/scikit-image/scikit-image.git
Now, you have remote repositories named:
upstream, which refers to the scikit-image repository
origin, which refers to your personal fork
Next, you need to set up your build environment. Please refer to Build environment setup for instructions.
Finally, we recommend you use a pre-commit hook, which runs some auto-formatters when you do a git commit.
Although our code is hosted on GitHub, our datasets are stored on GitLab and fetched with pooch. New datasets must be submitted on GitLab. Once merged, the data registry in the main GitHub repository can be updated.
Develop your contribution:
Pull the latest changes from upstream:
git checkout main
git pull upstream main
Create a branch for the feature you want to work on. Use a sensible name, such as ‘transform-speedups’:
git checkout -b transform-speedups
Commit locally as you progress (with git commit). Please write good commit messages.
To submit your contribution:
Push your changes back to your fork on GitHub:
git push origin transform-speedups
Enter your GitHub username and password (repeat contributors or advanced users can remove this step by connecting to GitHub with SSH).
Go to GitHub. The new branch will show up with a green “pull request” button – click it.
If you want, post on the developer forum to explain your changes or to ask for review.
Reviewers (the other developers and interested community members) will write inline and/or general comments on your pull request (PR) to help you improve its implementation, documentation, and style. Every single developer working on the project has their code reviewed, and we’ve come to see it as a friendly conversation from which we all learn and the overall code quality benefits. Therefore, please don’t let the review discourage you from contributing: its only aim is to improve the quality of the project, not to criticize (we are, after all, very grateful for the time you’re donating!).
To update your pull request, make your changes on your local repository and commit. As soon as those changes are pushed up (to the same branch as before) the pull request will update automatically.
Continuous integration (CI) services are triggered after each pull request submission to build the package, run unit tests, measure code coverage, and check the coding style (PEP8) of your branch. The tests must pass before your PR can be merged. If CI fails, you can find out why by clicking on the “failed” icon (red cross) and inspecting the build and test logs.
A pull request must be approved by two core team members before merging.
If your change introduces any API modifications, please update
If your change introduces a deprecation, add a reminder to TODO.txt for the team to remove the deprecated functionality in the future.
To reviewers: if it is not obvious from the PR description, add a short explanation of what a branch did to the merge message and, if closing a bug, also add “Closes #123” where 123 is the issue number.
If GitHub indicates that the branch of your PR can no longer be merged automatically, merge the main branch into yours:
git fetch upstream main
git merge upstream/main
If any conflicts occur, they need to be fixed before continuing. See which files are in conflict using:
git status
Which displays a message like:
Unmerged paths:
  (use "git add <file>..." to mark resolution)

  both modified:   file_with_conflict.txt
Inside the conflicted file, you’ll find sections like these:
<<<<<<< HEAD
The way the text looks in your branch
=======
The way the text looks in the main branch
>>>>>>> upstream/main
Choose one version of the text that should be kept, and delete the rest:
The way the text looks in your branch
Now, add the fixed file:
git add file_with_conflict.txt
Once you’ve fixed all merge conflicts, do:
git commit
Advanced Git users are encouraged to rebase instead of merge, but we squash and merge most PRs either way.
All code should have tests (see test coverage below for more details).
All code should be documented, to the same standard as NumPy and SciPy.
For new functionality, always add an example to the gallery (see Sphinx-Gallery below for more details).
No changes are ever merged without review and approval by two core team members. There are two exceptions to this rule. First, pull requests which affect only the documentation require review and approval by only one core team member in most cases. If the maintainer feels the changes are large or likely to be controversial, two reviews should still be encouraged. The second case is that of minor fixes which restore CI to a working state, because these should be merged fairly quickly. Reach out on the developer forum if you get no response to your pull request. Never merge your own pull request.
Set up your editor to remove trailing whitespace. Follow PEP 8.
Use numpy data types instead of strings (np.uint8 instead of "uint8").
Use the following import conventions:
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage as ndi

# only in Cython code
cimport numpy as cnp
cnp.import_array()
When documenting array parameters, use image : (M, N) ndarray and then refer to M and N in the docstring, if necessary.
Refer to array dimensions as (plane), row, column, not as x, y, z. See Coordinate conventions in the user guide for more information.
Functions should support all input image dtypes. Use utility functions such as img_as_float to help convert to an appropriate type. The output format can be whatever is most efficient. This allows us to string together several functions into a pipeline.
Use Py_ssize_t as data type for all indexing, shape and size variables in C/C++ and Cython code.
Use relative module imports, i.e. from .._shared import xyz rather than from skimage._shared import xyz.
Wrap Cython code in a pure Python function, which defines the API. This improves compatibility with code introspection tools, which are often not aware of Cython code.
For Cython functions, release the GIL whenever possible, using with nogil:.
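Several of the conventions above can be seen together in one small function. The sketch below is illustrative only: rescale_to_unit is a hypothetical helper, not part of scikit-image, and it converts with plain NumPy so that it runs without scikit-image installed (library code would normally use a utility such as img_as_float instead).

```python
import numpy as np


def rescale_to_unit(image):
    """Linearly rescale an image to the range [0, 1].

    Hypothetical example function, not part of scikit-image.

    Parameters
    ----------
    image : (M, N) ndarray
        Input image of any numeric dtype, with M rows and N columns.

    Returns
    -------
    out : (M, N) ndarray of float64
        Rescaled image. A constant image maps to all zeros.
    """
    # Use the NumPy dtype object, not the string "float64".
    image = np.asarray(image, dtype=np.float64)
    low, high = image.min(), image.max()
    if high == low:
        return np.zeros_like(image)
    return (image - low) / (high - low)


# Indexing is (row, column): this image has 2 rows and 3 columns.
example = np.array([[0, 5, 10], [10, 5, 0]], dtype=np.uint8)
result = rescale_to_unit(example)
```

Because the function accepts any input dtype and always returns float64, its output can be passed directly to another function written the same way.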
See the testing section of the Installation guide.
Tests for a module should ideally cover all code in that module, i.e., statement coverage should be at 100%.
To measure the test coverage, install the test requirements (using pip install -r requirements/test.txt) and then run:
$ spin coverage
This will print a report with one line for each file in skimage, detailing the test coverage:
Name                             Stmts   Exec  Cover   Missing
------------------------------------------------------------------------------
skimage/color/colorconv             77     77   100%
skimage/filter/__init__              1      1   100%
...
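Statement coverage only reaches 100% when every branch of a function is executed by some test. The sketch below illustrates this with a hypothetical function, clip_value, which is not part of scikit-image, and one test per branch. The test_ naming follows the usual pytest collection convention.

```python
def clip_value(value, low, high):
    """Hypothetical function with three distinct code paths."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def test_clip_below_range():
    # Exercises the first branch.
    assert clip_value(-1, 0, 10) == 0


def test_clip_above_range():
    # Exercises the second branch.
    assert clip_value(11, 0, 10) == 10


def test_clip_inside_range():
    # Exercises the fall-through return.
    assert clip_value(5, 0, 10) == 5
```

Removing any one of these tests would leave a return statement unexecuted, and that line would then be listed in the Missing column of the coverage report.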
To build the HTML documentation, you can run:
spin docs
Then, all the HTML files will be generated in scikit-image/doc/build/html/.
To rebuild a full clean documentation, run:
spin docs --clean
Sphinx and other Python packages needed to build the documentation can be installed using:
pip install -r requirements/docs.txt
If you are contributing an example to the gallery or editing an existing one, build the docs (see above) and open a web browser to check how your edits render at scikit-image/doc/build/html/auto_examples/: navigate to the file you have added or changed.
When adding an example, also visit scikit-image/doc/build/html/auto_examples/index.html to check how the new thumbnail renders on the gallery’s homepage. To change the thumbnail image, please refer to the relevant section of the Sphinx-Gallery docs.
Note that gallery examples should have a maximum figure width of 8 inches.
sudo apt-get install -qq texlive texlive-latex-extra dvipng
sudo tlmgr install ucs dvipng
“citation not found: R###”: There is probably an underscore after a reference in the first line of a docstring (e.g. [1]_). Use this method to find the source file:
$ cd doc/build; grep -rin R####
“Duplicate citation R###, other instance in…”: There is probably a [2] without a [1] in one of the docstrings.
Make sure to use pre-sphinxification paths to images (not the _images directory)
If the behavior of the library has to be changed, a deprecation cycle must be followed to warn users.
A deprecation cycle is not necessary when:
adding a new function, or
adding a new keyword argument to the end of a function signature, or
fixing what was buggy behavior
A deprecation cycle is necessary for any breaking API change, meaning a change where the function, invoked with the same arguments, would return a different result after the change. This includes:
changing the order of arguments or keyword arguments, or
adding arguments or keyword arguments to a function, or
changing a function’s name or submodule, or
changing the default value of a function’s arguments.
Usually, our policy is to put in place a deprecation cycle over two releases.
For the sake of illustration, we consider the modification of a default value in a function signature. In version N (therefore, next release will be N+1), we have
def a_function(image, rescale=True):
    out = do_something(image, rescale=rescale)
    return out
that has to be changed to
from warnings import warn

def a_function(image, rescale=None):
    if rescale is None:
        # Warn only when the caller relies on the default value.
        warn('The default value of rescale will change '
             'to `False` in version N+3.', stacklevel=2)
        rescale = True
    out = do_something(image, rescale=rescale)
    return out
and in version N+3
def a_function(image, rescale=False):
    out = do_something(image, rescale=rescale)
    return out
Here is the process for a 2-release deprecation cycle:
In the signature, set default to None, and modify the docstring to specify that it’s True.
In the function, if rescale is set to None, set to True and warn that the default will change to False in version N+3.
In doc/release/release_dev.rst, under deprecations, add “In a_function, the rescale argument will default to False in N+3.”
In TODO.txt, create an item in the section related to version N+3 and write “change rescale default to False in a_function”.
Note that the 2-release deprecation cycle is not a strict rule and in some cases, the developers can agree on a different procedure upon justification (like when we can’t detect the change, or it involves moving or deleting an entire function for example).
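The intermediate step of the cycle can be checked with the standard warnings module. The following is a minimal sketch under stated assumptions: do_something is a stand-in invented for this example, and the FutureWarning category is a choice made here, since the snippet above does not specify one. It confirms that the warning fires only when the caller relies on the default.

```python
import warnings


def do_something(image, rescale):
    # Hypothetical stand-in for the real computation.
    return [2 * value for value in image] if rescale else list(image)


def a_function(image, rescale=None):
    if rescale is None:
        warnings.warn(
            "The default value of rescale will change "
            "to `False` in version N+3.",
            FutureWarning,  # assumed category, see the lead-in
            stacklevel=2,
        )
        rescale = True
    return do_something(image, rescale=rescale)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = a_function([1, 2])                 # relies on the default
    explicit_result = a_function([1, 2], rescale=True)  # passes it explicitly

n_warnings = len(caught)
```

Callers who already pass rescale explicitly see no warning and no change in behavior, which is what makes the None sentinel suitable for this kind of transition.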
Scikit-image uses warnings to highlight changes in its API so that users may update their code accordingly. The stacklevel argument sets the location in the callstack where the warnings will point. In most cases, it is appropriate to set the stacklevel to 2. When warnings originate from helper routines internal to the scikit-image library, it may be more appropriate to set the stacklevel to 3. For more information, see the documentation of the warnings.warn function in the Python standard library.
To test if your warning is being emitted correctly, try calling the function from an IPython console. It should point you to the console input itself instead of being emitted by the files in the scikit-image library.
ipython:1: UserWarning: ...
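The effect of stacklevel can also be checked programmatically. The sketch below uses hypothetical names (_check_mode, public_function and caller are invented for this example) and shows a warning raised inside an internal helper with stacklevel=3 being attributed to the line that called the public function.

```python
import warnings


def _check_mode(mode):
    # Internal helper: stacklevel=3 skips this frame and public_function's
    # frame, so the warning is attributed to whoever called public_function.
    if mode == "old":
        warnings.warn("mode='old' is deprecated", FutureWarning, stacklevel=3)


def public_function(mode):
    _check_mode(mode)
    return mode


def caller():
    return public_function("old")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

# Line number the warning was attributed to.
reported_line = caught[0].lineno
# Line of the call inside caller(): one line below its "def" statement.
expected_line = caller.__code__.co_firstlineno + 1
```

With stacklevel=2 in the helper, the reported line would instead fall inside public_function, which is library code the user cannot act on.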
Please report bugs on GitHub.
While not mandatory for most pull requests, we ask that performance-related PRs include a benchmark in order to clearly depict the use case that is being optimized for. A historical view of our snapshots can be found at the following website.
In this section we will review how to set up the benchmarks, and three commands: spin asv -- dev, spin asv -- run and spin asv -- continuous.
Begin by installing airspeed velocity in your development environment. Prior to installation, be sure to activate your development environment, then if using venv you may install the requirement with:
source skimage-dev/bin/activate
pip install asv
If you are using conda, then the command:
conda activate skimage-dev
conda install asv
is more appropriate. Once installed, it is useful to run the command:
spin asv -- machine
to let airspeed velocity know more about your machine.
To write a benchmark, add a file in the benchmarks directory which contains a class with one setup method and at least one method prefixed with time_. The time_ method should only contain code you wish to benchmark. Therefore it is useful to move everything that prepares the benchmark scenario into the setup method. This function is called before calling a time_ method and its execution time is not factored into the benchmarks.
Take for example the TransformSuite benchmark:
import numpy as np
from skimage import transform

class TransformSuite:
    """Benchmark for transform routines in scikit-image."""

    def setup(self):
        self.image = np.zeros((2000, 2000))
        idx = np.arange(500, 1500)
        self.image[idx[::-1], idx] = 255
        self.image[idx, idx] = 255

    def time_hough_line(self):
        result1, result2, result3 = transform.hough_line(self.image)
Here, the creation of the image is completed in the setup method, and not included in the reported time of the benchmark.
It is also possible to benchmark features such as peak memory usage. To learn more about the features of asv, please refer to the official airspeed velocity documentation.
Also, the benchmark files need to be importable when benchmarking old versions of scikit-image. So if anything from scikit-image is imported at the top level, it should be done as:
try:
    from skimage import metrics
except ImportError:
    pass
The benchmarks themselves don’t need any guarding against missing features, only the top-level imports.
To allow tests of newer functions to be marked as “n/a” (not available) rather than “failed” for older versions, the setup method itself can raise a NotImplementedError. See the following example for the registration module:
try:
    from skimage import registration
except ImportError:
    raise NotImplementedError("registration module not available")
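The two guards described above can be combined in one benchmark file. The following is a sketch rather than an actual file from the benchmarks directory: RegistrationSuite, the image size and the shift are hypothetical, and the module-level None sentinel is a variation on the plain pass shown earlier. It assumes NumPy is installed and uses registration.phase_cross_correlation as the function being timed.

```python
import numpy as np

# Guard the top-level import so the file stays importable with old releases.
try:
    from skimage import registration
except ImportError:
    registration = None


class RegistrationSuite:
    """Hypothetical benchmark suite, for illustration only."""

    def setup(self):
        if registration is None:
            # Raising here marks the benchmark as "n/a" rather than "failed".
            raise NotImplementedError("registration module not available")
        rng = np.random.default_rng(0)
        self.reference = rng.random((256, 256))
        # Shift the reference by 3 rows and -5 columns.
        self.moving = np.roll(self.reference, shift=(3, -5), axis=(0, 1))

    def time_phase_cross_correlation(self):
        registration.phase_cross_correlation(self.reference, self.moving)


# Quick manual check outside asv: run setup and the benchmark body once.
suite = RegistrationSuite()
try:
    suite.setup()
    suite.time_phase_cross_correlation()
    status = "ran"
except NotImplementedError:
    status = "skipped"
```

The manual check at the end is only a convenience for spotting typos. Under asv, the framework itself calls setup before each time_ method.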
Prior to running the true benchmark, it is often worthwhile to test that the code is free of typos. To do so, you may use the command:
spin asv -- dev -b TransformSuite
The TransformSuite benchmark above will be run once in your current environment to test that everything is in order.
The command above is fast, but doesn’t test the performance of the code adequately. To do that you may want to run the benchmark in your current environment to see the performance of your change as you are developing new features. The command asv run -E existing will specify that you wish to run the benchmark in your existing environment. This will save a significant amount of time since building scikit-image can be a time-consuming task:
spin asv -- run -E existing -b TransformSuite
Often, the goal of a PR is to compare the results of the modifications in terms of speed to a snapshot of the code that is in the main branch of the scikit-image repository. The command asv continuous is of help here:
spin asv -- continuous main -b TransformSuite
This call will build out the environments specified in the asv.conf.json file and compare the performance of the benchmark between your current commit and the code in the main branch.
The output may look something like:
$ spin asv -- continuous main -b TransformSuite
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.7-cython-numpy1.15-scipy
·· Installing 544c0fe3 <benchmark_docs> into conda-py3.7-cython-numpy1.15-scipy.
· Running 4 total benchmarks (2 commits * 2 environments * 1 benchmarks)
[  0.00%] · For scikit-image commit 37c764cb <benchmark_docs~1> (round 1/2):
[...]
[100.00%] ··· ...ansform.TransformSuite.time_hough_line           33.2±2ms

BENCHMARKS NOT SIGNIFICANTLY CHANGED.
In this case, the differences between HEAD and main are not significant enough for airspeed velocity to report.
It is also possible to get a comparison of results for two specific revisions for which benchmark results have previously been run via the asv compare command:
spin asv -- compare v0.14.5 v0.17.2
Finally, one can also run ASV benchmarks only for a specific commit hash or release tag by appending ^! to the commit or tag name. For example, to run the skimage.filters module benchmarks on release v0.17.2:
spin asv -- run -b Filter v0.17.2^!