A Beginner's Guide to Learning NumPy (updated for NumPy 2)
DAT Linux
This book is available at https://leanpub.com/numpybyexample
This version was published on 2025-03-18
* * * * *
NumPy 2 is the biggest major update to NumPy in almost a decade. And whilst many of the updates in the new major and minor releases to date are related to re-organisation and efficiency improvements, there are also a considerable number of removals, deprecations, and additions. With this book being an introductory level guide, the impacts were manageable, and revisions were made to bring the text up to date where relevant; but also to bring focus to important changes such as the new numpy.strings
module - to which an additional chapter is dedicated.
2.2.3
, some code changes for suitability
numpy.strings
section
For a fuller insight into the NumPy 2.x changes, the reader is encouraged to at least look over the Highlights sections of each 2.x release, starting with the NumPy 2.0 release notes.
This book started out as a set of learning notes, which later turned into an (admittedly oversized) cheat-sheet of sorts. Converting those notes into a book format was a fairly natural step, as they were designed to introduce NumPy features in a structured way, with each chapter building on your knowledge incrementally.
You’ll soon discover that NumPy is about a few core concepts:
Arrays
Their types
The operations you perform on them.
The material is suitable for the NumPy beginner, although some basic programming knowledge — preferably Python — is assumed. It’s also expected that you have at least a rudimentary understanding of running shell commands from a terminal on your system of choice.
Many courses, especially in data science, have modules or bridging lessons for learning NumPy as a prerequisite. This book would certainly be suitable as an assigned text, or support material for this kind of introductory course. Chapters are concluded with a number of exercises (starting from chapter 3) to challenge the student. Solutions to the exercises are given in Appendix D. Solutions to Exercises.
The examples are concise (easy to reproduce by hand), designed to aid comprehension, and to capture the basics of a procedure or function that’s being introduced. We all have different learning styles, but with programming the goal is to write code, and code examples help express what a procedure does (programmers often say “the code is the ultimate documentation”). When you see the input, the code, and the output at once, then the purpose of a function is better revealed. “See also” sections will point the reader to functions that are related to the current topic, but not covered in detail.
We recommend reading the book from beginning to end for the first read. Even if you have some exposure to NumPy, you may still pick up some useful tidbits to fill the gaps in your knowledge.
As a note of caution, the code examples were written in an oversimplified way to assist learning, and should not be considered best practice in terms of code craft. Variables, for example, should always have meaningful names, and code should generally adhere to a preferred style, known in Python circles as “Pythonic”1. Some output has been formatted to improve readability, and may display a little differently to the format you see when running the example code yourself.
If you come across any issues with spelling, grammar, or code in the edition you’re reading, please send an email2. Errata will be added to a page on the companion website.
DAT Linux3 is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment. It’s based on Ubuntu, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop-shop for running and managing dozens of data science programs.
“Commitment is an act, not a word.”
- Jean-Paul Sartre
Learning something new, especially in the applied sciences, can take a great deal of commitment. The goal of this introductory chapter is to help you understand what NumPy is, why it has become so important for data computation, and why taking time to learn and practice it can be a rewarding endeavour.
NumPy is a Python library for scientific computing. It provides support for working with small or large multidimensional arrays and matrices, along with a host of mathematical functions to operate on them efficiently.
NumPy’s versatility and utility make it essential for many applications where numerical computing is required. Its widespread adoption and integration with other libraries means it has become an indispensable software tool across many scientific disciplines.
NumPy is in use across a wide range of scientific fields, including:
Field | Uses |
---|---|
Data Science & Analytics | Data manipulation, cleaning, transformation, and analysis. |
Machine Learning & Artificial Intelligence | Implementing algorithms for model training and inference. |
Scientific Computing | Solving equations, performing Fourier analysis, signal processing. |
Engineering | Simulation and modelling. |
Finance & Economics | Analysing financial data, modelling economic systems, algorithms for forecasting and risk management. |
Image & Signal Processing | Image manipulation, filtering, and analysis. |
Bio-informatics & Computational Biology | Analysing genomic data, modelling biological systems, algorithms for sequence analysis. |
Academic Research | Frameworks for analysing research data. |
NumPy integrates with many other scientific libraries, or serves as the foundation upon which they are built. Here are a few examples:
The central feature of Pandas is the tabular data-frame data structure. It adds functionality on top of NumPy arrays such as labelled indexing, time-series analysis, and data manipulation routines (querying, filtering, sub-setting).1
SciPy relies on NumPy to build functionality such as modules for optimisation, interpolation, integration, linear algebra, signal processing, and more.2
scikit-learn is a popular machine learning library. NumPy’s efficient array operations and memory management make it well-suited for large-scale computations required in machine learning tasks such as regression and classification.3
PyArrow is the Python interface to the Apache Arrow columnar data format; a specialised data format for fast in-memory analytics. It integrates seamlessly with NumPy arrays.8
Statsmodels is a library for statistical modelling and hypothesis testing in Python. It relies on NumPy arrays for data representation and manipulation, allowing users to perform various statistical analyses such as regression, time series analysis, and hypothesis testing, using familiar NumPy syntax.9
OpenCV is a computer vision library that utilises NumPy arrays for representing and processing image and video data, implementing processing tasks such as filtering, transformation, and feature extraction.10
Learning NumPy — even if you rely more on higher level libraries in your daily work or projects — can still be highly beneficial for several reasons:
NumPy introduces fundamental concepts in numerical computing such as array and matrix manipulation, vectorised operations, and broadcasting. Understanding these concepts can help you make better choices when solving data manipulation problems.
Many concepts and techniques learned in NumPy are transferable to other libraries and languages. For example, understanding array operations in NumPy can make it easier to work with similar data structures in other languages like MATLAB or R.
NumPy serves as a foundation for more advanced topics in numerical computing such as linear algebra, Fourier analysis, optimisation, and signal processing. Knowledge of these foundational concepts can be valuable in various fields.
As we’ve seen, many other libraries and frameworks in the Python ecosystem build upon or interact with NumPy. Understanding NumPy can facilitate your use of these libraries.
Proficiency in NumPy is often a requirement or desirable skill for roles in data science, machine learning, scientific computing, and related fields. Familiarity with it can enhance your employability.
NumPy has a large and active community, so there are ample resources available for learning and troubleshooting. You gain access to a wealth of tutorials, documentation, forums, and open-source projects to support learning and implementation.
Understanding NumPy provides a solid foundation for expanding into more advanced topics in data analysis and scientific computing, empowering data developers to tackle the complexity of data-driven applications.
Python itself does not have built-in support for arrays of more than basic utility (i.e. lists), which is where libraries like NumPy come in. The array object (from Python's standard library array module) provides some extra capability around strict typing (this is discussed in more detail in Appendix B - Python Lists & array Vs NumPy Arrays), but it also falls seriously short of what NumPy can do.
Arguably, NumPy has become the de facto standard for array, matrix, and numerical computation in the scientific Python community.
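To make the contrast concrete, here is a minimal sketch comparing a plain Python list with a NumPy array. The variable names are illustrative only and do not come from the book's examples.

```python
import numpy as np

prices_list = [1.5, 2.0, 3.25]
prices_arr = np.array(prices_list)

# Multiplying a list repeats its contents..
repeated = prices_list * 2
# ..whereas multiplying an array scales every element.
doubled = prices_arr * 2

print(repeated)          # [1.5, 2.0, 3.25, 1.5, 2.0, 3.25]
print(doubled.tolist())  # [3.0, 4.0, 6.5]
```

The array version applies the arithmetic to each element without an explicit loop, which is the behaviour most numerical code relies on.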
Pandas. pydata.org. https://pandas.pydata.org/↩︎
SciPy. scipy.org. https://scipy.org/↩︎
Scikit-Learn. scikit-learn.org. https://scikit-learn.org/stable/↩︎
Matplotlib. matplotlib.org. https://matplotlib.org/↩︎
Seaborn. pydata.org. https://seaborn.pydata.org/↩︎
TensorFlow. tensorflow.org. https://www.tensorflow.org/↩︎
PyTorch. pytorch.org. https://pytorch.org/↩︎
PyArrow. apache.org. https://arrow.apache.org/docs/python/↩︎
Statsmodels. statsmodels.org. https://www.statsmodels.org/stable/index.html↩︎
OpenCV. opencv.org. https://opencv.org/↩︎
This chapter will help you get started with using NumPy if you’re a relative beginner. You can skip any of these sections depending on how advanced you are with your system set-up.
Before you can use NumPy you need to have the Python (version 3) interpreter installed on your system. To check if you have Python installed, run the following command from a console:
python --version
You should see something like the following displayed:
If instead you see an error, then Python may not be installed. Sometimes the python
command is not mapped to python3
, so try again with the latter command instead.
Many Linux distributions have Python pre-installed via their package manager. If yours is a point or two behind the latest version, this is perfectly fine — a good distribution will ensure security patches are updated even if the latest versions of packages are not yet in a distribution’s “stable” branch.
Otherwise, the Python downloads1 web page is where you can find Python binaries for most operating systems. Choose the latest version that is available. Once installed, try the python --version
command again.
pip
The pip
command is used to install packages from the Python Package Index2. To ensure you have pip
installed, run the following from the command line:
pip --version
You should see output similar to this:
If you get an error then pip
may need to be installed manually. Please follow the instructions via the pip documentation web page3 and retry the above command.
NumPy installation is very simple: run the following from the console and wait for the package manager to complete the install process:
pip install numpy
The package installation should not take very long, and it will automatically install any other libraries that NumPy depends on (that aren’t already installed in the Python environment). You can easily get a snapshot of all Python packages that are currently installed in this Python environment using the following command:
pip list
The package numpy
will be listed here.
Once you have Python and NumPy ready, you need a way to write and execute your code. This section will highlight three methods of achieving this.
Python is a scripting language, which means all you need is a script — a text file containing Python code — and a Python interpreter to execute the code. It will then output the results of your program, or any errors that are preventing the code from being executed correctly. There can be more than one script file for a program, allowing you to separate code into modules4. But for simplicity, begin with a single script file, name it anything you like, for example numpy_starter.py
, and write the following line of code:
print("Hello world!")
Save your file, open a terminal console, and change into the directory containing the script. To execute it, type in the following and press [Enter]:
python numpy_starter.py
You should see 'Hello world!'
echoed to the console.
There are many online resources for learning how to run Python scripts, including: How to Run Your Python Scripts and Code5.
IPython is a special shell environment for interacting with Python in a REPL* fashion. It’s simple to install using pip
:
pip install ipython
Once installed, run the command ipython
from your console and you’ll enter the IPython interactive environment. Here you can write Python code as you would in a script and execute that code with the [Enter] key. Here is an example interaction:
As you can see, once you run the code (it can be over more than one line) it presents the results, then prompts you for more input — and, handily, it remembers variables or functions declared during a session. To learn more about IPython, go to the IPython website6.
* REPL stands for Read-Eval-Print Loop. See the REPL Wikipedia entry7.
Jupyter originated from IPython, expanding its capabilities within interactive scientific notebooks. The application is easily installed using pip
:
pip install jupyterlab
Jupyter runs as a web application on your local machine, and can be started with the following command:
jupyter lab
To run system commands from within an IPython shell, you need to begin the command with an !
:
!jupyter lab
Once the web application launches, you can either open an existing notebook (they have the .ipynb
file suffix), or create a new one — choosing the “Python kernel” option for your new notebook. You run code from within a notebook inside code “cells”, like so:
To learn more about installing and using Jupyter, visit the official Quick Start Guide8.
Importing the NumPy library as np
is considered a standard practice.
import numpy as np
The example code in this book was prepared and tested using NumPy version:
np.__version__
It’s easy to get in-line help information (often with example usage) on NumPy objects such as classes, arrays, and functions:
np.info(np.add)
numpy.add
is one of many array computation functions that NumPy offers. This, and many other array operations will be highlighted throughout the book.
You can use Python’s built-in help()
function to search for documentation related to specific modules, classes, or functions. For example:
help('numpy.random')
help('numpy.random.normal')
If you’re looking for all the attributes and methods available in a module or object, dir()
is helpful:
import numpy as np
dir(np)
You can programmatically control the way arrays and numbers are formatted for display. Here’s an example output with default display (no explicit setting):
a = np.random.normal(0, 10, (3,4))
print(a)
Here’s the same array output with a (temporary) custom display setting:
with np.printoptions(precision=3):
    print(a)
Python Releases. python.org. https://www.python.org/downloads↩︎
PyPI. pypi.org. https://pypi.org↩︎
Pip Installation. pip developers. https://pip.pypa.io/en/stable/installation/↩︎
Modules. python.org. https://docs.python.org/3/tutorial/modules.html↩︎
Run Python Scripts. Real Python. https://realpython.com/run-python-scripts↩︎
IPython — Interactive Computing. IPython. https://ipython.org↩︎
REPL. Wikipedia. https://en.wikipedia.org/wiki/Read-eval-print_loop↩︎
Jupyter Quick Start. Antonino Ingargiola; contributors. https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest↩︎
The multidimensional array is the central data structure in NumPy, consisting of a fixed-size grid of elements of the same type. The elements of each dimension can be indexed by a tuple of non-negative integers starting at zero, Boolean indexes, or other arrays. Dimensions can be referenced by their axis, also indexed from zero. The NumPy array class is the ndarray
— of which all NumPy arrays are an instance.
The number of dimensions of an array, together with the size of each dimension, defines its “shape”. The shape of a NumPy array (ndarray.shape) is a tuple of integers, each value indicating the size along that dimension, in order of axis.
1-D array | 2-D array | 3-D array | N-D array* |
---|---|---|---|
Vector | Matrix | Cube | Tensor |
shape: (4,) | shape: (2,4) | shape: (3,4,2) | shape: (a,b,c,d,..) |
* Whilst it can be difficult to visualise dimensions of 4-D or greater, multidimensional arrays (cubes1 or tensors2) are an essential data representation structure used in business intelligence, machine learning, and deep learning applications.
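The shapes in the table above can be checked directly. This is a small sketch that uses np.zeros (covered in the Array Creation chapter) purely as a convenient way to build arrays of a given shape; the variable names are illustrative.

```python
import numpy as np

vector = np.zeros(4)          # 1-D
matrix = np.zeros((2, 4))     # 2-D
cube = np.zeros((3, 4, 2))    # 3-D

for arr in (vector, matrix, cube):
    # ndim is the number of axes; shape gives the size along each axis.
    print(arr.ndim, arr.shape)
# 1 (4,)
# 2 (2, 4)
# 3 (3, 4, 2)
```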
Elements are identified by order of axis, and position along that axis. Take the following 2-D array, a = np.array([[1, 2, 3], [4, 5, 6]])
:
[[1, 2, 3],
[4, 5, 6]]
The position of a specific element is represented as a tuple (row, column), best illustrated as follows:
[[(0,0), (0,1), (0,2)],
[(1,0), (1,1), (1,2)]]
Therefore the element with value 2 is referenced using a[(0,1)] (1st row, 2nd column). Note the parentheses are optional, so a[0,1] also works.
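As a quick check of the indexing described above, the following sketch reads a few elements from the same 2-D array using both forms.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Tuple index and the bare comma form select the same element.
print(a[(0, 1)])  # 2
print(a[0, 1])    # 2

# 2nd row, 3rd column.
print(a[1, 2])    # 6
```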
NumPy comes with a larger set of available data types than standard Python. It defines a host of array-scalar types3, as well as aliases for some. There’s a built-in numpy.dtype
object associated with each of the array-scalar types. This table includes most of the NumPy data types that you’ll encounter or need.
Category | Name (numpy.?) | Python type * | Alias ^ | Short code |
---|---|---|---|---|
Signed integer | byte | | int8 | b |
| short | | int16 | h |
| intc | | int32 | i |
| int_, long | | int64 | l |
| longlong | | | q |
Unsigned integer | ubyte | | uint8 | B |
| ushort | | uint16 | H |
| uintc | | uint32 | I |
| uint, ulong | | uint64 | L |
| ulonglong | | | Q |
Float | half | | float16 | e |
| single | | float32 | f |
| double * | float | float64 | d |
| longdouble | | float128 | g |
Complex | csingle | | complex64 | F |
| cdouble * | complex | complex128 | D |
| clongdouble | | complex256 | G |
String | bytes_ | | | S |
| str_ * | str | | U, <U? ~ |
Other | bool, bool_ * | bool | | ? |
| object_ | | | O |
| datetime64 * | datetime.datetime | | M |
| timedelta64 * | datetime.timedelta | | m |
| void | | | V |
^ Aliases displayed based on a Linux x86_64 system. An alias is referenced using np.<alias>
in the same way as numpy.<name>
; or declared as either dtype="<name>"
or dtype="<alias>"
during array creation.
* These NumPy types are directly compatible with (drop-in replacements for) the given Python type. For example dtype="double"
, or its alias dtype="float64"
, may instead be declared as dtype=float
.
~ When you specify a Unicode string type such as dtype="<U10"
, you’re setting a fixed size of 10 characters per string. There isn’t a strict upper limit on this number, but the size of the array (in terms of system resources) will impose practical limits.
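The notes above can be illustrated with a short sketch. It assumes a platform where double is 64 bits wide (as on the Linux x86_64 system the table is based on); the sample names are made up for illustration.

```python
import numpy as np

# The name, its sized alias, and the Python type resolve to the same dtype.
print(np.dtype("double") == np.dtype("float64") == np.dtype(float))  # True

# A fixed-width Unicode dtype silently truncates longer strings.
names = np.array(["Alice", "Bartholomew-Smith"], dtype="<U10")
print(names[1])              # Bartholome
print(names.dtype.itemsize)  # 40 (10 characters at 4 bytes each)
```

The silent truncation is worth keeping in mind when choosing a width for a string dtype.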
dtype
creation with a NumPy object, and its string (alternative) variation:
np.dtype(np.int_)
np.dtype('long')
np.dtype('l')
dtype
creation using a sized alias:
np.dtype(np.int64)
np.dtype('int64')
np.dtype('i8')
dtype
creation using Python built-in aliases:
np.dtype(float)
np.dtype('float')
The following chapter on Array Creation will demonstrate how explicitly referencing or creating dtype
objects can be done during the array creation process.
These are special user-defined data types used with structured arrays. See 4.9 Structured Arrays in the next chapter for more on this topic. The following are a few examples of how they’re created.
dtype
creation, each field is assigned a name and type (like a key/value pair):
np.dtype([('name1', np.float64), ('name2', np.int32)])
dtype
creation, specify names and formats separately as lists:
np.dtype({'names': ['name1', 'name2'],
'formats': ['f', 'i']})
dtype
creation, specify formats as strings alone (where names not required):
np.dtype('f, i')
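To see what these structured dtypes contain, the sketch below inspects the field names of two of the forms shown above. When names are not supplied, NumPy assigns default names of the form f0, f1, and so on.

```python
import numpy as np

named = np.dtype([("name1", np.float64), ("name2", np.int32)])
unnamed = np.dtype("f, i")

print(named.names)     # ('name1', 'name2')
print(unnamed.names)   # ('f0', 'f1')

# Total size of one record in bytes: 8 (float64) + 4 (int32).
print(named.itemsize)  # 12
```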
NumPy includes some useful predefined constants, including:
numpy.nan
numpy.inf
numpy.e
numpy.pi
numpy.newaxis
numpy.euler_gamma
Here’s the output for numpy.e
:
np.e
And for numpy.pi
:
np.pi
Constant | Description |
---|---|
nan | IEEE 754 floating point representation of Not a Number. |
inf | IEEE 754 floating point representation of (positive) infinity. |
newaxis | A convenient alias for None, useful for indexing arrays. |
euler_gamma | gamma = 0.5772156649015328606065120900824024310421…. |
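A few of these constants behave in ways that can surprise newcomers. The following sketch demonstrates them; the array names are illustrative.

```python
import numpy as np

# nan never compares equal to anything, including itself; use isnan instead.
print(np.nan == np.nan)    # False
print(np.isnan(np.nan))    # True

# inf is larger than any finite float.
print(np.inf > 1e308)      # True

# newaxis is simply None, used to insert an axis of length one.
print(np.newaxis is None)  # True
row = np.array([1, 2, 3])
column = row[:, np.newaxis]
print(row.shape, column.shape)  # (3,) (3, 1)
```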
Create a new 64-bit dtype of float type, using a sized alias, and assign it to the variable t. Print out the variable.
Repeat the above dtype
creation, but instead using an equivalent native Python type.
Write a Python expression that calculates the area of a circle with a radius of 30 mm. (The result will be in units of mm².)
Convert the area of the circle to units of cm². Print the result.
NumPy arrays (of type numpy.ndarray
) can be easily constructed using the numpy
function array
, but there are other convenient ways of creating pre-populated arrays. The following sections outline a number of these approaches. Always consult a function’s documentation to gain a deeper understanding of its inputs, outputs, and limitations.
The simplest way to create a NumPy array is to pass a Python list to the array
function.
An array’s data type is determined by the provided list’s values (integers default to numpy.int64
):
a = np.array([1, 2, 3])
print(a)
Same as the previous example, but with a preassigned list:
d = [1, 2, 3]
a = np.array(d)
print(a)
Array values are implicitly promoted to the most general type present (in this example, cast to numpy.float64):
# 2-D array..
a = np.array([(1.0, 2), (3, 4)])
print(a)
Explicitly force the type (values are converted to numpy.float32):
a = np.array([(1, 2, 3)], dtype='f')
print(a)
Explicitly force the type (demote all values to numpy.int32
):
a = np.array([(1.1, 2, 3)], dtype='int32')
print(a)
Assign a dtype
and pass it to array()
:
# Declaring a type once and re-using it is more
# efficient if done often..
t = np.dtype('float')
a = np.array([(1, 2), (3, 4)], dtype=t)
print(a)
If you want an array with a certain shape but need to defer filling it with values to a later time, then you can create an “empty” array.
Fill an array (of this shape) with meaningless values, requiring correct initialisation later:
a = np.empty([2, 3])
print(a)
You might expect an “empty” array to be filled with zeros or None
(Python’s version of Null) values. np.empty
should therefore be used with caution, as whilst the values it generates are meaningless, they may introduce invalid computation results in your code until populated with meaningful values.
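One safe pattern, sketched here under the assumption that the final values are known shortly after allocation, is to assign every element before reading any of them. The name buffer is illustrative.

```python
import numpy as np

buffer = np.empty((2, 3))    # contents are arbitrary at this point

# Assign every element before the array is read anywhere.
buffer[:] = 0.0
buffer[0] = [1.0, 2.0, 3.0]

print(buffer.tolist())  # [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
```

If the array needs a known starting value anyway, np.zeros or np.full expresses that intent more directly.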
You can fill a newly created array with specific preferred values based on your needs.
Just some zeros please:
a = np.zeros(4)
print(a)
Nothing but ones:
a = np.ones((2,2))
print(a)
Fill / pad a new larger array with a smaller list:
# First argument is shape, second contains the values
# to fill with..
a = np.full((2,2), (3,4))
print(a)
Creating new arrays filled with sequences is surprisingly convenient and powerful.
Get a spread of values from within the range {0 < 12} in increments of 2:
a = np.arange(0., 12, 2)
print(a)
Get four values equally spaced within the range {0 <= 6} (inclusive):
a = np.linspace(0, 6, 4)
print(a)
Get four values spaced evenly on a base 10 log scale between 10^0 and 10^2 (inclusive):
# 10 raised to each of four evenly spaced exponents from 0 to 2..
a = np.logspace(0, 2, 4)
print(a)
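The relationship between these three functions can be verified with a short sketch. It checks that logspace matches 10 raised to the output of linspace, and shows how arange and linspace treat the end of the range.

```python
import numpy as np

# logspace(0, 2, 4) is 10 raised to four evenly spaced exponents.
exponents = np.linspace(0, 2, 4)
by_hand = 10.0 ** exponents
print(np.allclose(by_hand, np.logspace(0, 2, 4)))  # True

# arange excludes the stop value..
print(np.arange(0., 12, 2).tolist())  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
# ..whereas linspace includes it by default.
print(np.linspace(0, 6, 4).tolist())  # [0.0, 2.0, 4.0, 6.0]
```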
numpy.random
Use a seed to ensure the same values are reproduced, if required:
np.random.seed(42)
42 was chosen arbitrarily, but it can be any other number of your choosing.
Create a 2x1 array containing random float values in the half-open range [0., 1.), meaning 1. itself is never returned:
a = np.random.random((2,1))
print(a)
3x4 array of random int values from –2 up to, but not including, 10:
a = np.random.randint(-2, 10, (3,4))
print(a)
Modify a sequence in-place by shuffling its contents (only shuffles the array along the first axis):
np.random.shuffle(a)
print(a)
Randomly permute a sequence (this returns a shuffled copy and leaves the original unchanged):
a = [1, 2, 3, 4, 5, 6]
b = np.random.permutation(a)
print(b)
3x4 array of random values drawn from a normal distribution with a mean of 0. and a standard deviation of 10.:
a = np.random.normal(0, 10, (3,4))
print(a)
3x4 array of random values drawn from the standard normal distribution (mean 0, standard deviation 1):
a = np.random.standard_normal((3,4))
print(a)
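The examples above use NumPy's legacy module-level random functions. As an alternative that this book does not rely on, NumPy also offers a Generator interface created with numpy.random.default_rng, which the NumPy documentation recommends for new code. The following is a rough sketch of the equivalent calls; the variable names are illustrative.

```python
import numpy as np

# Two generators with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

floats = rng_a.random((2, 1))               # floats in [0, 1)
ints = rng_a.integers(-2, 10, size=(3, 4))  # ints from -2 to 9
normals = rng_a.normal(0, 10, size=(3, 4))  # mean 0, std dev 10

print(np.array_equal(floats, rng_b.random((2, 1))))  # True
```

A generator object keeps its own state, so seeding one does not affect random numbers drawn elsewhere in a program.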
Function * | Description |
---|---|
numpy.random.binomial | Draw samples from a binomial distribution |
numpy.random.chisquare | Draw samples from a chi-square distribution |
numpy.random.gamma | Draw samples from a Gamma distribution |
numpy.random.uniform | Draw samples from a uniform distribution |
* “See also” short descriptions and links, courtesy of numpy.org.
Array-like objects are data structures that may be used as inputs to a wide variety of NumPy array creation functions. This book will only touch on the most common ones. Consult a function’s documentation to understand its acceptable input types.
Common array-like objects include:
NumPy arrays (numpy.ndarray)
Python scalars or lists (1. or [1, 2, 3])
Tuples ((1, 2, 3))
Strings ('1;2;3')
Most of these functions create a copy of an array, so changes to the copy will not affect the original array. We’ll be using the array a, a 3 x 4 matrix, based on the result of the last operation:
print(a)
Create an array from an array-like object (no copy is made if a is already an ndarray with a matching dtype):
b = np.asarray(a)
print(b)
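Whether numpy.asarray copies its input is easy to test. In this sketch (with illustrative names), asarray hands back the very same object when given an ndarray, while numpy.array makes an independent copy by default.

```python
import numpy as np

source = np.array([1, 2, 3])
same = np.asarray(source)   # input is already an ndarray: no copy
copied = np.array(source)   # np.array copies by default

print(same is source)    # True
print(copied is source)  # False

# A change made through `same` is visible in `source`..
same[0] = 99
print(source[0])  # 99
# ..but not in the independent copy.
print(copied[0])  # 1
```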
Array of ones with the same dimensions as the array-like input:
b = np.ones_like(a)
print(b)
Array of zeros with the same dimensions as the array-like input:
b = np.zeros_like(a)
print(b)
Array of preferred values with the same dimensions as the array-like input:
b = np.full_like(a, 2.)
print(b)
Empty (meaningless values) array with the same dimensions as array-like input:
b = np.empty_like(a)
print(b)
Cast to different type (creates a copy of the array):
b = a.astype(np.int_)
print(b)
Explicitly copy an array:
b = a.copy() # OR: b = np.copy(a)
print(b)
copy
also works with sub-setting:
b = a[1:].copy()
print(b)
Sub-setting is a feature of NumPy that allows you to extract precise subsets of an array. Chapter 7. Array Selection & Modification covers this topic in detail, along with other powerful ways to access array data.
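The reason copy matters with sub-setting is that a plain slice shares memory with the original array. A minimal sketch, using illustrative names and a small array built with arange and reshape:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
view = grid[1:]                     # a slice shares memory
independent = grid[1:].copy()       # an explicit copy does not

print(np.shares_memory(grid, view))         # True
print(np.shares_memory(grid, independent))  # False

# Writing through the view changes the original array.
view[0, 0] = -1
print(grid[1, 0])         # -1
print(independent[0, 0])  # 3
```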
Create an array from an array-like string:
s = '1,2,3.2'
a = np.fromstring(s, sep=',')
print(a)
NumPy offers a range of functions for creating matrices. Matrix operations will be covered in Chapter 8. Array Computation.
Identity matrix of dimension n x n:
a = np.identity(n=3, dtype=int)
print(a)
Matrix of suitable dimension given diagonal:
a = np.diag([1, 2, 3])
print(a)
Identity matrix of dimension N
xN
shifted by a diagonal offset (k
):
a = np.eye(N=3, k=1)
print(a)
Generate a Vandermonde matrix:
a = np.vander([1, 2, 3])
print(a)
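For readers unfamiliar with the Vandermonde matrix, each column holds the input values raised to a power, with the powers decreasing from left to right by default. The sketch below rebuilds the matrix by hand to confirm this; the names x and v are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
v = np.vander(x)

# Columns are x**2, x**1, x**0 (decreasing powers).
expected = np.column_stack([x**2, x**1, x**0])
print(np.array_equal(v, expected))  # True

# increasing=True reverses the column order.
print(np.vander(x, increasing=True).tolist())
# [[1, 1, 1], [1, 2, 4], [1, 3, 9]]
```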
Lower triangle of an array, with zeros above:
b = np.tril(a)
print(b)
Function | Description |
---|---|
numpy.asmatrix | Interpret the array-like input as a matrix |
numpy.diagflat | Create a 2-D array with the flattened input as a diagonal |
Structured data types were mentioned in the previous chapter, but here we show how an array can be created using them.
Structured arrays are useful for when you need to work with heterogeneous data:
# Compound type..
t = np.dtype([('name', 'U20'), ('age', int)])
a = np.array([('Alice', 25), ('Bob', 30)], dtype=t)
print(a)
print(a[0])
print(a['name'][1])
Structured arrays are a way of implementing mixed types in NumPy. They may affect the performance and utility of your array, so use them with caution.
Record arrays are a special kind of structured array that permits field access using the form x.y. Unlike regular NumPy arrays (of type numpy.ndarray), record arrays are instances of numpy.recarray. Record arrays can be created explicitly, or converted from regular arrays.
Create a regular array with named compound type, and convert it to a record array:
a = np.array([(1, 2.), (3, 4.)],
dtype=[('x', '<i2'), ('y', '<f2')])
b = a.view(np.recarray)
b.y
Confirm this is a recarray
object:
type(b)
Create an uninitialised record array of shape (2,2) and initialise the array:
a = np.recarray((2,2),
dtype=[('x', '<i2'), ('y', '<f2')])
a.x = [1, 2.]
a.y = [3, 4.]
a
Notice that the data type is an instance of record
:
a.dtype
Create an array from a text or binary file:
# File f typically created with ndarray.tofile()..
a = np.fromfile(f)
Load an array from a text (commonly CSV) file:
a = np.loadtxt(f, delimiter=',')
Load an array from a NumPy binary file (.npy or .npz), or from a pickled object if allow_pickle=True is also passed:
a = np.load(p)
Load text into an array with tidying capabilities:
a = np.genfromtxt(f, ...)
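The file-loading functions above all need an existing file. To try them without creating one, the following sketch substitutes in-memory buffers from Python's io module for the file arguments f and p; the sample data is made up for illustration.

```python
import io
import numpy as np

# loadtxt accepts any file-like object, not only a filename.
csv_text = io.StringIO("1.0,2.0,3.0\n4.0,5.0,6.0\n")
from_csv = np.loadtxt(csv_text, delimiter=",")
print(from_csv.shape)  # (2, 3)

# genfromtxt tolerates missing fields, which become nan by default.
gappy_text = io.StringIO("1,,3\n4,5,6\n")
tidy = np.genfromtxt(gappy_text, delimiter=",")
print(np.isnan(tidy[0, 1]))  # True

# Round trip through NumPy's binary .npy format.
buffer = io.BytesIO()
np.save(buffer, from_csv)
buffer.seek(0)
restored = np.load(buffer)
print(np.array_equal(restored, from_csv))  # True
```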
Function | Description |
---|---|
numpy.asanyarray | Convert input to an ndarray, but pass ndarray subclasses through |
numpy.bmat | Build matrix object from string, nested sequence, or array |
numpy.ascontiguousarray | Return a contiguous array (ndim >= 1) in memory |
numpy.choose | Create an array, by element index, from a choice of arrays |
numpy.fromfunction | Construct an array by executing a function over each coordinate |
numpy.fromiter | Create a new 1-dimensional array from an iterable object |
numpy.frombuffer | Interpret a buffer as a 1-dimensional array |
numpy.fromregex | Construct an array from a text file, using regular expression parsing |
numpy.geomspace | Get numbers spaced evenly on a log scale (geometric progression) |
numpy.mat | Interpret the input as a matrix (no copy); removed in NumPy 2.0, use numpy.asmatrix instead |
numpy.meshgrid | Return a list of coordinate matrices from coordinate vectors |
numpy.tri | An array with ones at and below the given diagonal and zeros elsewhere |
numpy.triu | Upper triangle of an array (zeros below) |
Refer to Appendix C. NumPy Function & Property Reference (or the online documentation1) for a full list of array creation functions available in the numpy
module.
The following table shows data relating to three students’ test scores (0 — 100%) over four different tests.
Student No. | Test #1 | Test #2 | Test #3 | Test #4 |
---|---|---|---|---|
1 | 63.5 | 56. | 68 | 73.5 |
2 | 53 | 77.5 | 61 | 83 |
3 | 59 | 79 | 67.5 | 70 |
Create a 2-D Python list of the test scores, with the student scores as “rows” in test order. Assign this list to the variable student_scores_list
. Print out this list.
Using the list from Exercise 4-1, create a NumPy array assigned to student_scores_arr
, and explicitly assign it an appropriate floating point dtype
. Print out the array.
Create a copy of student_scores_arr
whilst assigning it to a new variable. Change the type of the copied array to a suitable integer. What do the scores look like now? What effect did the conversion have on the values?
Create a new array filled with ones, of the same dimensions as student_scores_arr
. Print the array. What is the dtype
of this array?
Create an identity matrix of 4x4 size, and print the result.
Design a suitable named compound type for student_scores_arr
(using sensible names without spaces), where the student id is an integer, and the scores are a floating point. Recreate the scores array to use this compound type. Print out the dtype
for the array. (Hint: use tuples as rows.)
Convert the structured array created in Exercise 4-6 to a recarray
array. Print out the 2nd field (column) of the record array by name.
NumPy Reference. numpy.org. https://numpy.org/doc/stable/reference↩︎
Inspecting a NumPy array involves examining its properties and attributes to gain a better understanding of its characteristics and contents. The following examples rely on the array defined here:
a = np.array([(1., 2.), (3., 4.)])
print(a)
Get the shape of the array — a tuple indicating the length of each dimension:
a.shape
The number of dimensions:
a.ndim
The total number of elements in the array:
a.size
Length of an element, in bytes:
# This is directly associated with the array's type..
a.itemsize
Information about the memory layout of the array:
a.flags
An array value will evaluate to True if it represents anything other than zero.
Test if all elements in an array evaluate to True:
np.all(a)
Test if any (at least a single) element evaluates to True:
np.any(a)
The array’s data type (dtype
):
a.dtype # OR: np.dtype(a)
Name of the array’s data type:
a.dtype.name
Character code of the data type:
a.dtype.char
Get the unique number for this data type:
a.dtype.num
Get a printable string of an array’s contents:
np.array2string(a)
Get a string of an array plus info about its type:
a = np.array([(1, 2), (3, 4)], np.int32)
np.array_repr(a)
Function | Description |
---|---|
numpy.dtype.byteorder | A character indicating the byte-order of a dtype object |
numpy.dtype.fields | Dictionary of names defined for this data type, or None |
numpy.dtype.flags | Bit-flags describing how this data type is to be interpreted |
numpy.dtype.isbuiltin | Integer indicating how this dtype relates to the built-in dtypes |
numpy.dtype.isnative | Boolean if the byte order of dtype is native to the platform |
numpy.dtype.kind | A character code (one of ‘biufcmMOSUV’) identifying the kind of data |
numpy.ndarray.strides | Tuple of bytes to step in each dimension when traversing an array |
numpy.ndarray.nbytes | Total bytes consumed by the elements of the array |
numpy.nonzero | Return the indices of the elements that are non-zero |
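As a rough illustration of a few entries in the table, the sketch below inspects the same 2x2 float array used earlier in this chapter. The array z is a made-up example, and the stride values shown in the comments assume the default float64 type.

```python
import numpy as np

# Same array as used in the inspection examples.
a = np.array([(1., 2.), (3., 4.)])

kind = a.dtype.kind        # 'f' for floating point
native = a.dtype.isnative  # True when the byte order matches the platform
strides = a.strides        # bytes to step per axis: (16, 8) for 2x2 float64

# nonzero returns one index array per axis; in this hypothetical array
# only the element at [0, 1] is non-zero.
z = np.array([[0, 5], [0, 0]])
rows, cols = np.nonzero(z)

print(kind, native, strides)
print(rows, cols)
```

The stride of 16 bytes on the first axis is simply two 8-byte elements: the distance in memory from one row to the next.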
These exercises refer to the following 2-D array:
a = np.array([(1, 2, 3), (4, 5, 6)])
What is the shape of the above NumPy array? Use the array’s shape property to confirm your conclusion.
What is the size of each element in this array, in bytes?
Use the appropriate inspection property to find the total bytes consumed by the array. How does this compare to the previous exercise’s result multiplied by the total count of elements?
What is the string name of the data type of this array?
If a third row containing the elements [7 8 9] was added to the array, what would be the number of dimensions?
Input and output (I/O) operations in NumPy primarily involve fetching data from external sources into NumPy arrays, and saving NumPy arrays to external files. We’ve already seen input with numpy.fromfile(), numpy.loadtxt(), and numpy.load(), but persisting to, and retrieving arrays from, various file formats is also possible and simple to do.
Persist a single array to a file in binary format:
a = np.array([(1., 2.), (3., 4.)])
np.save('file.npy', a)
Read the file back into an array variable:
b = np.load('file.npy')
print(b)
You can just as easily persist (and therefore retrieve) multiple arrays.
Persist multiple arrays to an archive file:
# 'a', 'b' are keys for retrieval (can be any valid
# python variable name)..
np.savez('file.npz', a=a, b=b)
Load the archive back into the array variables:
c = np.load('file.npz')
a = c['a']
b = c['b']
print(a)
print(b)
An alternative to savez is savez_compressed, which is used in exactly the same way, except that the file is compressed to reduce its size. The data is decompressed on load in the same way as before using np.load(), and nothing special needs to be done. Be aware that the extra compression and decompression steps can affect program performance with larger files.
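A minimal sketch of a compressed round trip is shown here. It assumes a temporary directory and a hypothetical file name (file_compressed.npz) so that nothing is left behind on disk; the zero-filled array b is only there because repetitive data compresses well.

```python
import os
import tempfile

import numpy as np

a = np.array([(1., 2.), (3., 4.)])
b = np.zeros((100, 100))  # highly repetitive data compresses well

# Write to a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file_compressed.npz')
    np.savez_compressed(path, a=a, b=b)

    # np.load returns an archive object; the with-block closes it afterwards.
    with np.load(path) as archive:
        a2 = archive['a']
        b2 = archive['b']

print(np.array_equal(a, a2), np.array_equal(b, b2))
```

Loading is identical to the uncompressed case; only the save call changes.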
CSV (comma separated values) is a common format for storing tabular data1 in text files.
Here we see how simple it is to save an array to CSV:
np.savetxt('file.csv', a, delimiter=',')
The contents of the CSV file will look like this:
A CSV file very often comes with a header row that represents a name for each “column”, for example:
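The skiprows argument can be used to ignore the header when reading the file into an array (here f is the path to the CSV file, and delimiter=',' tells NumPy that the values are comma separated):
a = np.loadtxt(f, delimiter=',', skiprows=1)
print(a)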
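The header handling can be tried end to end without touching the file system. The sketch below is an illustration only: it uses an in-memory text buffer in place of a file, and the column names x and y are invented for the example.

```python
import io

import numpy as np

a = np.array([(1., 2.), (3., 4.)])

# An in-memory text buffer stands in for a file on disk.
buf = io.StringIO()

# header= writes a first line; comments='' stops NumPy prefixing it with '# '.
np.savetxt(buf, a, delimiter=',', header='x,y', comments='', fmt='%.1f')
text = buf.getvalue()
print(text)

# Rewind the buffer, then skip the header row when reading the data back.
buf.seek(0)
c = np.loadtxt(buf, delimiter=',', skiprows=1)
print(c)
```

Without skiprows=1, loadtxt would try to convert the header text to numbers and raise an error.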
In Python, file paths can be specified as relative — to the ‘current’ directory, e.g.:
dir/file.npy
or absolute — from the ‘root’ directory, e.g.:
/home/dir/file.npy
Relative paths depend on where the application was launched (and you may not know this reliably). So if in doubt prefer absolute paths.
If you are developing a data solution that needs to run on multiple platforms, you should use the os.path.sep
constant to ensure file and directory paths are system-independent:
import os
path = 'dir' + os.path.sep + 'file.csv'
np.savetxt(path, a, delimiter=',')
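As an alternative to joining strings with os.path.sep by hand, the standard library can build the path for you. This is a sketch under the assumption that a temporary directory is acceptable for the demonstration; the dir and file.csv names simply mirror the example above.

```python
import os
import tempfile
from pathlib import Path

import numpy as np

a = np.array([(1., 2.), (3., 4.)])

with tempfile.TemporaryDirectory() as tmp:
    # os.path.join inserts the correct separator for the current platform.
    path_a = os.path.join(tmp, 'dir', 'file.csv')

    # pathlib builds the same path with the / operator.
    path_b = Path(tmp) / 'dir' / 'file.csv'
    same = Path(path_a) == path_b

    # The sub-directory must exist before NumPy can write into it.
    path_b.parent.mkdir(parents=True, exist_ok=True)
    np.savetxt(path_b, a, delimiter=',')
    loaded = np.loadtxt(path_b, delimiter=',')

print(same)
print(loaded)
```

Both approaches produce the same path; pathlib tends to read more naturally when several path operations are chained.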
Create a file called student_scores.csv and save it to your computer, adding to it the following contents:
Load the file into a NumPy array, excluding the header, and print the array.
Load the array instead using the compound type:
t = np.dtype([('student_no', 'int'),
('test_1', float),
('test_2', float),
('test_3', float)])
Print the new array.
Hint: Try using genfromtxt with the names=True argument.
Convert the array you created in Exercise 6-2 to a recarray
, and print out a list of the student numbers using the array.field notation.
CSV. Wikipedia. https://en.wikipedia.org/wiki/Comma-separated_values↩︎
Array selection in NumPy relates to the activity of locating and extracting specific elements, or sub-sets, from an array. NumPy has flexible options for selecting array data based on the principle of indexing (using bracket notation []
) that can hold: scalars; tuples; slices; or Boolean expressions, used to identify and locate elements as sub-sets of interest.
Modifying values goes hand in hand with sub-setting, but the selection styles behave differently. Slicing creates a view that shares memory with the original array, so modifications to the sub-set also change the original array. Simple indexing, on the other hand, returns a scalar value, whilst selecting with fancy or boolean indexing creates a copy of the data, so changes to the new array do not affect the original. To explicitly copy a sub-set, use the ndarray.copy() method.
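One way to check which kind of result you have is np.shares_memory. The following sketch (variable names are illustrative) compares a slice, a fancy-indexed selection, a boolean selection, and an explicit copy:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

view = a[1:3]             # slicing: a view onto a's memory
fancy = a[[1, 2]]         # fancy indexing: a new array
mask = a[a > 1]           # boolean indexing: a new array
explicit = a[1:3].copy()  # explicit copy of a slice

shares = [bool(np.shares_memory(a, x)) for x in (view, fancy, mask, explicit)]
print(shares)

view[0] = 99   # writes through to a
fancy[0] = -1  # does not touch a
print(a)
```

Only the slice shares memory with a, so only the write through view shows up in the original array.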
Indexing on 1-D arrays (vectors) is similar to indexing with Python lists. The following examples use the array a = np.array([1, 2, 3, 4])
, depicted as:
1
2
3
4
Remember that Python indexing is zero-based, so value 1 is at index [0], value 2 is at index [1], and so on. Let’s look at some common techniques for indexing simple NumPy arrays:
Method | No. | Example | Sub-set | Comment |
---|---|---|---|---|
a[m] * | 7.1 | a[0] | 1 _ _ _ | First element of the vector array |
 | 7.2 | a[2] | _ _ 3 _ | Third element |
 | 7.3 | a[-1] | _ _ _ 4 | Last element |
a[m:n] ^ | 7.4 | a[0:3] | 1 2 3 _ | From index 0 up to < index 3 |
 | 7.5 | a[1:-2] | _ 2 _ _ | From index 1 up to < index at len-2 |
a[:] | 7.6 | a[:] | 1 2 3 4 | Select all elements |
a[m:] | 7.7 | a[2:] | _ _ 3 4 | From index 2 up to last element |
a[:n] | 7.8 | a[:2] | 1 2 _ _ | From index 0 up to < index 2 |
a[m:n:p] | 7.9 | a[0:3:2] | 1 _ 3 _ | Using from:to:step-by notation |
 | 7.10 | a[::-1] | 4 3 2 1 | Reverse the array |
* The first method uses simple indexing, but the rest are slicing operations — which create a volatile “view” of the underlying array.
With slicing operations the left index is inclusive but the right index is exclusive. So a[0:3]
reads as: get the sub-set starting at the value at index 0
, up to but not including the value at index 3
. If it helps understand it better, think of m:n as where to put a cursor — you put the first cursor to the left of index m, and the second cursor to the left of index n. In the case of a[0:3]
, like so: |0 1 2 |3. The indices between the cursors define the sub-set of interest.
Array modification is also possible using indexing or slicing, but the data being assigned must match the dimensions of the slice, or be a scalar. (This is an early peek into broadcasting; see Chapter 8. Array Computation for more on this topic.)
The following examples show the array a = np.array([1, 2, 3, 4])
being progressively modified:
Method | No. | Example | Result | Comment |
---|---|---|---|---|
a[m] = p | 7.11 | a[1] = 9 | 1 9 3 4 | Second element modified |
a[m:n] = [q,r,..] | 7.12 | a[0:2] = [7, 8] | 7 8 3 4 | Lengths must match |
a[m:n] = p | 7.13 | a[:] = 9 | 9 9 9 9 | Replace all with scalar |
 | 7.14 | a[2:] = 8 | 9 9 8 8 | Replace from idx 2 to last |
a[m:n:p] = [q,r..] | 7.15 | a[::2] = [1, 2] | 1 9 2 8 | Step-wise modification |
 | 7.16 | a[:2] = [1, 2, 3] | ValueError! | Lengths incompatible |
Recall that slicing creates a “view” of the underlying array, meaning changes to the sub-set will affect the original array it was sliced from. As a demonstration:
a = np.array([1, 2, 3, 4])
b = a[0:2]
b[1] = 9
print(b)
print(a)
2-D (matrices) or n-D (multidimensional) array selection permits indexing and slicing in a similar way, using a tuple of expressions applied to each axis. The following examples use the array:
a = np.array([[1, 2, 3], [4, 5, 6]])
a 2x3 matrix depicted as:
1
2
3
4
5
6
Select all rows, 2nd column:
a[:,1]
Select 2nd row, all columns:
a[1,:]
Select all rows, every 2nd column:
a[:,::2]
Single element (2nd row, 3rd column):
a[1,2] # Or a[(1,2)]; the parentheses are optional
Modify the last column:
a[:,2] = [8,9]
Fancy indexing allows you to access and manipulate specific elements of an array by using arrays, lists, expressions, or Boolean lists to locate the target elements. A selection made with fancy indexing is a copy of the data, so modifying the selected sub-set does not modify the original array.
The following examples use the arrays:
1
2
3
4
and:
1
2
a = np.array([1, 2, 3, 4])
b = np.array([1, 2])
Pass a list of indexes to match:
a[[0, 1, 2]]
Pass an array of indexes to match:
a[b]
The following examples use the array:
a = np.array([[1, 2, 3],
[4, 5, 6]])
Depicted as:
1
2
3
4
5
6
Mixed mode (slicing & fancy indexing):
# All rows, and these columns..
a[:,[1,2]]
Recall that fancy indexing creates a copy of the array, meaning changes to the sub-set won’t affect the original array. As a demonstration:
b = a[:,[1,2]]
b[0,1] = 9
print(b) # Value at b[0,1] is now 9
print(a) # Original unchanged: a[0,2] is still 3
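The copy behaviour applies when reading through a fancy index. When a fancy index appears on the left-hand side of an assignment, the original array is updated. A small sketch of both cases, using the same 2x3 array:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Reading through a fancy index gives an independent copy...
b = a[:, [1, 2]]
b[0, 1] = 9
unchanged = a.copy()  # snapshot: a is still the original at this point

# ...but assigning through a fancy index writes into the original array.
a[:, [1, 2]] = 0

print(unchanged)
print(a)
```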
Boolean indexing with NumPy allows you to select elements from an array based on a condition, targeting the array’s values at indexes that meet the condition (is True). Consider the following examples, where:
a = np.array([1., 2., 3., 4.])
b = (a < 2)
First, let’s see what b evaluates to:
print(b)
b is assigned the result of an element-wise conditional evaluation; in this case returning True where each value, v, in a, meets the condition v < 2, otherwise returning False. So to proceed, we have the variables a and b to work with:
a =
1
2
3
4
— a NumPy array.
b =
True
False
False
False
(a NumPy array of Boolean values, not a Python list).
Explicit boolean list selection:
a[[True, False, False, True]]
Boolean array (variable) selection:
a[b]
Negated boolean list selection:
# Negation converts all False values to True, and True
# to False..
a[~b]
Combining arithmetic with a boolean expression:
# For each value in a, True if odd number else False
a[a%2 == 1]
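Conditions can also be combined, and a Boolean mask can be the target of an assignment. The sketch below illustrates both with the same array; the variable names middle and ends are made up for the example.

```python
import numpy as np

a = np.array([1., 2., 3., 4.])

# Combine conditions with & (and) and | (or). The parentheses are required
# because these operators bind more tightly than the comparisons.
middle = a[(a > 1) & (a < 4)]
ends = a[(a < 2) | (a > 3)]

# A mask can also be the target of an assignment, which modifies a in place.
a[a > 2] = 0

print(middle, ends, a)
```

Note that Python's own and/or keywords do not work element-wise on arrays, which is why the & and | operators are used here.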
The exercises will refer to the following NumPy array:
student_scores_list = [
[1, 63.5, 56.0, 68.0, 73.5],
[2, 53.0, 77.5, 61.0, 83.0],
[3, 59.0, 79.0, 67.5, 70.0]
]
scores_array = np.array(student_scores_list)
Note: answers to array selection questions will be affected by any previous question’s modifications.
Get a list of (only) the scores of the 1st student.
Print out student IDs of the 1st two students.
Modify the score of the 2nd student in the second test to 87.
Print the scores of all students in the 3rd test.
Modify the scores of all students in the 4th test to 75.
Print the scores of the 2nd student in the last two tests.
Modify the scores of all students in the 1st test to 70.
Print the ID and scores of the last student.
Repeat the selection from Exercise 7-8 but assign it to the variable sub_arr
. Change the first score of sub_arr
to 77. Is this sub-selection a view? Confirm by printing both sub_arr
and scores_array
to see if the original array has also been modified.
Retrieve an array of the scores only (no student id) and apply a conditional expression to return True|False for any scores over 80.
Array computation in NumPy is about performing efficient and versatile mathematical operations and data manipulations on multidimensional arrays. These include arithmetic, logical, matrix, set, and statistical operations. Ufuncs (universal functions) allow you to perform operations in an element-wise fashion, while broadcasting allows arrays of different shapes to be combined and operated on by automatically adjusting the dimensions of a smaller array to match that of a larger one.
Where required, the examples below use the array:
a = np.array([1, 2, 3])
Element-wise absolute value for int or float:
Functions | Example | Result |
---|---|---|
abs , fabs | np.abs(np.array([-1, 2, -3.1])) | [1., 2., 3.1] |
Ceiling or floor of each element:
Functions | Example | Result |
---|---|---|
ceil , floor | np.floor(np.array([-1.3, 2, 3.8])) | [-2., 2., 3.] |
Round to nearest int or decimal:
Functions | Example | Result |
---|---|---|
rint , round | np.rint(np.array([-1.3, 2, 3.8])) | [-1., 2., 4.] |
Calculate the square root or square:
Functions | Example | Result |
---|---|---|
sqrt , square | np.sqrt(np.array([9, 16, 4])) | [3., 4., 2.] |
Exponentiation (e^x, 2^x, 1/x):
Functions | Example | Result |
---|---|---|
exp , exp2 , reciprocal | np.exp(np.array([1, 2])) | [2.71828183, 7.3890561] |
Natural log, base 10 log, base 2 log, log(1+x):
Functions | Example | Result |
---|---|---|
log , log10 , log2 , log1p | np.log10(np.array([100, 1000])) | [2., 3.] |
Get the sign of each number (-1, 0, or 1):
Functions | Example | Result |
---|---|---|
sign | np.sign(np.array([-2, 2.5])) | [-1., 1.] |
Split an array into [fraction, integral] parts:
Functions | Example | Result |
---|---|---|
modf | np.modf(np.array([1, -2.1])) | [ 0. , -0.1], [ 1., -2.] |
Test for NaN, +/-infinity, finiteness:
Functions | Example | Result |
---|---|---|
isnan , isinf , isfinite | np.isnan(np.array([1, np.nan, np.inf])) | [False, True, False] |
Trigonometric functions and their hyperbolic and inverse relations:
Functions | Example | Result |
---|---|---|
cos , sin , tan | np.sin(np.array([0, 1])) | [0., 0.84147098] |
cosh , sinh , tanh | np.sinh(np.array([0, 1])) | [0., 1.17520119] |
arccos , arccosh , arcsin | np.arcsin(np.array([0, 1])) | [0., 1.57079633] |
arcsinh , arctan , arctanh | np.arcsinh(np.array([0, 1])) | [0., 0.88137359] |
Evaluate this if condition met, otherwise that:
Functions | Example | Result |
---|---|---|
where | np.where(a < 3, a , a * 3) | [1, 2, 9] |
Get the truth values of a negation:
Functions | Example | Result |
---|---|---|
logical_not | np.logical_not(a<2) | [False, True, True] |
~ (operator) | ~(a<2) | [False, True, True] |
Real or imaginary component of complex numbers:
Functions | Example | Result |
---|---|---|
real | np.real(np.array([1+2j, 3+4j])) | [1., 3.] |
imag | np.imag(np.array([1+2j, 3+4j])) | [2., 4.] |
Basic math and logic functions have corresponding operators that can be used interchangeably, whilst math operators can also be used in the operator-assignment style (i.e. +=
). Where required, the following examples use the arrays:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a.copy()
In the following tables, some functions display the function name paired with its equivalent math operator.
Common arithmetic functions and their operator shortcuts:
Functions | Example | Result |
---|---|---|
add (+ ), subtract (- ) | np.add(a, b) | [5, 7, 9] |
multiply (* ), divide (/ ) | c *= b | [4, 10, 18] |
floor_divide (// ) | np.floor_divide(a, b) , or (a // b ) | [0, 0, 0] |
remainder , mod (% ) | np.remainder(b, a) | [0, 1, 0] |
divmod | np.divmod(b, a) | [4, 2, 2], [0, 1, 0] |
power (** ) | np.power(a, b) , or a ** b | [1, 32, 729] |
minimum , maximum | np.maximum(a, b) | [4, 5, 6] |
copysign | np.copysign(a, [-4, 5, 6]) | [-1., 2., 3.] |
Element-wise comparison tests:
Functions | Example | Result |
---|---|---|
greater (> ), greater_equal (>= ) | np.greater(b, a) | [True, True, True] |
less (< ), less_equal (<= ) | np.less(b, a) | [False, False, False] |
equal (== ), not_equal (!= ) | np.equal(a, [1, 4, 5]) | [True, False, False] |
Broadcasting allows arrays with different shapes to be combined in operations, but only when their dimensions are compatible: comparing the shapes from the trailing (right-most) dimension backwards, each pair of dimensions must either be equal, or one of them must be 1 (a missing leading dimension is treated as 1).
Scalar operations with 1-D arrays:
Where: a = np.array([1, 2, 3])
Example | Broadcast (intermediate step) | Result |
---|---|---|
a * 2 | a * [2,2,2] | [2,4,6] |
Scalar operations with 2-D arrays:
Where: a = np.array([[1,2,3], [4,5,6]])
Example | Broadcast (intermediate step) | Result |
---|---|---|
a * 2 | a * [[2,2,2], [2,2,2]] | [[2,4,6], [8,10,12]] |
Operations between a 2x3 array and a 2x1 array:
Where: a = np.array([[1,2,3], [4,5,6]])
Where: b = np.array([[2.], [3.]])
Example | Broadcast (intermediate step) | Result |
---|---|---|
np.multiply(a, b) | a * [[2.,2.,2.], [3.,3.,3.]] | [[2.,4.,6.], [12.,15.,18.]] |
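The compatibility rule can be checked without doing any arithmetic. The sketch below is illustrative only: it uses np.broadcast_shapes to compute the resulting shape and np.broadcast_to to reveal the intermediate step, then shows an incompatible pairing failing.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([[2.], [3.]])            # shape (2, 1)

# Shapes are compared right to left: 3 vs 1 (ok, stretch), 2 vs 2 (ok).
shape_ok = np.broadcast_shapes(a.shape, b.shape)

# broadcast_to shows the intermediate array that b is treated as.
b_stretched = np.broadcast_to(b, a.shape)

# A (2, 3) array and a length-2 vector are not compatible: 3 vs 2.
try:
    a * np.array([1, 2])
    failed = False
except ValueError:
    failed = True

print(shape_ok, failed)
print(b_stretched)
```

No data is actually duplicated during broadcasting; broadcast_to returns a read-only view that merely behaves as if the values were repeated.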
Perform a ‘dot’ product on two matrices:
a = np.array([[1, 2, 3], [4, 5, 6]]) # 2x3 matrix
b = np.array([1, 2, 3]) # 1-D array of length 3, treated as a 3x1 column vector
np.dot(a, b)
With matrix multiplication (where at least one array is 2-D), the number of columns of the first matrix must match the number of rows of the second matrix (where A is an m x n matrix, and B is n x p). The resulting matrix is of dimension m x p. Here is the above example in matrix notation:
[[1, 2, 3], [4, 5, 6]] · [1, 2, 3]ᵀ = [1·1 + 2·2 + 3·3, 4·1 + 5·2 + 6·3]ᵀ = [14, 32]ᵀ
The dot product will behave differently if one side is a scalar. Let’s compare a scalar with a one-element array in a dot product with a 2-D array:
# A 2 x 3 matrix:
a = np.array([[1, 2, 3], [4, 5, 6]])
# A scalar:
b = 2
# A 1-D array holding a single element:
c = np.array([2])
# Applies element-wise multiplication:
np.dot(a, b)
The dot operation falls back to a “scalar * array” operation. An explicit one-element array fails, however, as NumPy attempts the “matrix dot matrix” product operation with incompatible dimensions:
np.dot(a, c) # ValueError: matrix dimensions incompatible!
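For comparison, the @ operator (np.matmul) gives the same answer as np.dot for a matrix and a vector, but it is stricter about scalars. The following sketch contrasts the two; the flag names are invented for the example.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([1, 2, 3])               # shape (3,)

# For these shapes np.dot and the @ operator agree.
r_dot = np.dot(a, b)
r_at = a @ b

# np.dot accepts a scalar, but @ (matmul) does not.
scaled = np.dot(a, 2)
try:
    a @ 2
    at_scalar_ok = True
except (ValueError, TypeError):
    at_scalar_ok = False

# A one-element array is still treated as a vector, so the shapes must align.
try:
    np.dot(a, np.array([2]))
    dot_len1_ok = True
except ValueError:
    dot_len1_ok = False

print(r_dot, r_at, at_scalar_ok, dot_len1_ok)
```

For plain scaling, a * 2 states the intent more directly than either dot or @.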
Function | Description |
---|---|
cross(a,b) | Return the cross product of two (arrays of) vectors |
inner(a,b) | Ordinary inner product of 1-D arrays, or sum product over the last axes |
kron(a,b) | Kronecker product of two arrays |
outer(a,b) | Compute the outer product of two vectors |
tensordot(a,b) | Compute tensor dot product along specified axes |
A set, by definition, is an unordered collection of unique objects. Set functions can be unary (e.g. unique
) or binary (e.g. union1d
).
Where: a = np.array([[1,2,2], [4,4,6]]), get the unique values:
Example | Result |
---|---|
np.unique(a) | [1,2,4,6] |
Find the union of two arrays:
Example | Result |
---|---|
np.union1d([-1,2,3], [1,3,5]) | [-1,1,2,3,5] |
Test whether each element of a 1-D array is also present in a second array:
Example | Result |
---|---|
np.isin([-1,2,3], [1,3,5]) | [False, False, True] |
Function | Description |
---|---|
intersect1d(a,b) | Find the intersection of two arrays |
isin(a,b) | Boolean array of size a, where elements of a exist in b |
setdiff1d(a,b) | Find the set difference of two arrays |
setxor1d(a,b) | Find the set exclusive-or of two arrays |
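A brief sketch of the remaining binary set functions, reusing the two input arrays from the union example (the names x and y are chosen here for illustration):

```python
import numpy as np

x = np.array([-1, 2, 3])
y = np.array([1, 3, 5])

common = np.intersect1d(x, y)  # values present in both arrays
only_x = np.setdiff1d(x, y)    # values in x but not in y
either = np.setxor1d(x, y)     # values in exactly one of the two arrays

print(common, only_x, either)
```

Each function returns a sorted 1-D array of unique values.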
Logic tests. Where a = np.array([[1, 2], [3, 4]]):
Function | Example | Result |
---|---|---|
array_equal(a,b) | np.array_equal([1,1],[1]) | False |
array_equiv(a,b) | np.array_equiv([1,1],[1]) | True |
select(condlist,choicelist,default) | np.select([a>1], [a*2], 99) | [[99, 4], [6, 8]] |
Function | Description |
---|---|
all(a) | Test whether all elements along a given axis evaluate to True |
allclose(a,b) | Test if two arrays are element-wise equal within a tolerance |
any(a) | Test whether any array element along a given axis evaluates to True |
nonzero(a) | Return the indices of the elements that are non-zero |
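The tolerance-based comparison matters most with floating point values. As an illustrative sketch (the arrays here are invented for the example), compare an exact test with allclose, and apply all and any along an axis:

```python
import numpy as np

x = np.array([0.1 + 0.2, 1.0])
y = np.array([0.3, 1.0])

exact = np.array_equal(x, y)  # False: 0.1 + 0.2 is not exactly 0.3
close = np.allclose(x, y)     # True: equal within the default tolerance

# all/any accept an axis; here each column is tested separately.
m = np.array([[1, 0], [3, 4]])
cols_all = np.all(m, axis=0)
cols_any = np.any(m, axis=0)

print(exact, close, cols_all, cols_any)
```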
Statistical functions are typically unary, but can be made to operate along individual axes.
Stats operations. Where: a = np.array([[1, 2], [3, 4]]):
Function | Example | Result |
---|---|---|
mean | np.mean(a) | 2.5 |
max | np.max(a, axis=1) | [2, 4] |
cumsum | np.cumsum(a) | [ 1, 3, 6, 10] |
Function | Description |
---|---|
argmax | Returns the indices of the maximum values along an axis |
argmin | Returns the indices of the minimum values along an axis |
cumprod | Return the cumulative product of elements along a given axis |
min | Return the minimum of an array or minimum along an axis |
std | Compute the standard deviation along the specified axis |
sum | Sum of array elements over a given axis |
var | Compute the variance along the specified axis |
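To make the effect of the axis argument concrete, here is a short sketch using the same 2x2 array. The variable names are illustrative; note that std defaults to the population form (ddof=0).

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

col_sums = np.sum(a, axis=0)       # collapse the rows: one value per column
row_means = np.mean(a, axis=1)     # collapse the columns: one value per row
flat_argmax = np.argmax(a)         # index into the flattened array
row_argmax = np.argmax(a, axis=1)  # position of the maximum within each row
spread = np.std(a)                 # population standard deviation (ddof=0)

print(col_sums, row_means, flat_argmax, row_argmax, spread)
```

A helpful way to remember it: the axis you name is the one that disappears from the result.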
The first five exercises will refer to the following NumPy arrays:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
Perform element-wise addition (+) and multiplication (*) of the two arrays and print the results.
Does the matrix dot
operation between the arrays result in a valid array? Print the result. Can you explain how the result was generated?
Perform element-wise comparison (greater than, less than, equal to) between the elements of these arrays. Print the result of each comparison.
Perform scalar division on the array a
with the value 5. Print the resulting array. What is the dtype
of the result?
Create a larger NumPy array b
with dimensions (3, 3) containing random values. Then, add array a
to b
. Print the resulting array.
Given the following NumPy array:
student_scores_list = [
[1, 63.5, 56.0, 68.0, 73.5],
[2, 53.0, 77.5, 61.0, 83.0],
[3, 59.0, 79.0, 67.5, 70.0]
]
scores_array = np.array(student_scores_list)
What percentage of student scores are over 80?
Hint: think about summation of a Boolean array from the result of an element-wise math expression (True evaluates to 1, False to 0).
Compute the mean and standard deviation of the NumPy array arr
:
arr = np.array([1, 2, 3, 4, 5])
Compute and print out the median and quartiles (25th and 75th percentiles) of the following NumPy array arr
:
arr = np.array([10, 20, 30, 40, 50])
Compute and print out the correlation coefficient between the two NumPy arrays x
and y
:
x = np.array([1.1, 2.3, 3.3, 4.1, 5.6])
y = np.array([5.8, 4, 3.4, 2.1, 1.05])
Would you say these arrays are strongly correlated, weakly correlated, or not correlated? If correlated, in what direction?
NumPy arrays can be transformed in many ways; these generally include transposing, reshaping, combining, splitting, rotating, and sorting. The following examples use this array, a 2x3 matrix:
a = np.array([[1, 2, 3], [4, 5, 6]])
Depicted as:
1
2
3
4
5
6
Transposing swaps the rows and columns of a 2-D array; the transpose of a 1-D array is unchanged.
Transpose a 2-D array:
np.transpose(a)
Transposing returns a view of the array. a.swapaxes(0,1)
achieves the same result.
Arrays can be reshaped to a desired (but compatible) shape.
Reshape a 2x3 array to 3x2:
np.reshape(a, (3,2)) # Or: a.reshape(3,2)
New and old shapes must be compatible. E.g. a.reshape(2,2) would result in an error, but np.reshape(a, (1,6)) yields: [[1, 2, 3, 4, 5, 6]] (a 2-D array with a single row).
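The sketch below illustrates shape compatibility, including the -1 placeholder that asks NumPy to infer one dimension from the others. The variable names are made up for the example.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

row = np.reshape(a, (1, 6))  # still 2-D: one row, six columns
auto = a.reshape(-1, 2)      # -1 lets NumPy infer the row count (3)

try:
    a.reshape(2, 2)          # 6 elements cannot fill a 2x2 array
    bad_ok = True
except ValueError:
    bad_ok = False

print(row.shape, auto.shape, bad_ok)
```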
Flattening reduces an array to a single dimension.
Flatten an array into a vector:
np.ravel(a) # Returns a view where possible
Flattening with order:
np.ravel(a, order='F') # Column-major order; returns a copy here
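Whether ravel returns a view depends on the memory layout of the input. As a sketch, assuming the default row-major (C-ordered) array from above, np.shares_memory shows which results still point at the original data:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # C-ordered (row-major) by default

r_c = np.ravel(a)             # row-major order
r_f = np.ravel(a, order='F')  # column-major order
flat = a.flatten()            # flatten always returns a copy

print(r_c, r_f)
print(np.shares_memory(a, r_c),
      np.shares_memory(a, r_f),
      np.shares_memory(a, flat))
```

Here only the row-major ravel shares memory with a; the column-major result has to be copied because its elements are not laid out in that order in memory.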
See the API docs for various Flip and reverse ordering options.
Rotations can be performed at the element level or upon an axis.
Rotate by flip & reverse:
np.flip(a)
Rotate by flipping on axis:
np.flip(a, 0)
Flip and rotate operations behave as you’d expect, with variations depending on optional arguments. Note flip(a,0) ~= flipud(a) and flip(a,1) ~= fliplr(a).
Rotate an array by 90 degrees in the plane specified by axes:
a = np.array([[1, 2, 3], [4, 5, 6]])
np.rot90(a, 1)
Function | Description |
---|---|
roll | Roll array elements along a given axis. |
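For illustration, here is how roll and rot90 behave on the same 2x3 array. This is a minimal sketch; the variable names are invented.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

rolled_flat = np.roll(a, 1)          # no axis: roll the flattened array, keep shape
rolled_cols = np.roll(a, 1, axis=1)  # shift each row one place to the right
rotated = np.rot90(a, 1)             # one 90 degree turn, counter-clockwise

print(rolled_flat)
print(rolled_cols)
print(rotated)
```

Elements pushed off one end by roll reappear at the other, so no values are lost.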
The following examples will make use of the arrays:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9]])
Depicted as:
1
2
3
4
5
6
and
7
8
9
Join arrays in sequence along an axis:
np.concatenate((a, b)) # Default axis is 0
Arrays must have the same shape, except in the dimension corresponding to the given (or default) axis.
Combine & flatten into a vector:
np.concatenate((a, b), axis=None)
The concatenate
function creates a copy of the data.
Split into multiple arrays:
The following example will make use of the 1D array, a = np.array([1, 2, 3, 4, 5, 6])
:
1
2
3
4
5
6
np.split(a, 2) # Returns a list of arrays
Function | Description |
---|---|
hsplit | Split an array into multiple sub-arrays horizontally (column-wise) |
dsplit | Split array into multiple sub-arrays along the 3rd axis (depth) |
vsplit | Split an array into multiple sub-arrays vertically (row-wise) |
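A short sketch of the horizontal and vertical variants follows, using a hypothetical 2x6 array m. It also shows np.array_split, which, unlike np.split, tolerates a division that leaves a remainder.

```python
import numpy as np

m = np.arange(1, 13).reshape(2, 6)  # hypothetical 2x6 example array

left, right = np.hsplit(m, 2)  # two 2x3 halves, split column-wise
top, bottom = np.vsplit(m, 2)  # two 1x6 halves, split row-wise

# np.split needs an even division; array_split tolerates a remainder.
parts = np.array_split(np.array([1, 2, 3, 4, 5]), 2)

print(left.shape, top.shape, [p.tolist() for p in parts])
```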
The following examples will make use of the arrays:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
Depicted as:
1
2
3
and:
4
5
6
Stacking arrays by row:
np.stack((a, b))
This operation is equivalent to np.vstack((a, b))
Stacking arrays by column:
np.stack((a, b), axis=-1)
Function | Description |
---|---|
hstack | Stack arrays in sequence horizontally (column wise) |
dstack | Stack arrays in sequence depth wise (along third axis) |
vstack | Stack arrays in sequence vertically (row wise) |
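The resulting shapes are the quickest way to tell the stacking functions apart. A minimal sketch with the same two 1-D arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

h = np.hstack((a, b))  # shape (6,): joined end to end
v = np.vstack((a, b))  # shape (2, 3): one array per row
d = np.dstack((a, b))  # shape (1, 3, 2): paired along a new third axis

print(h.shape, v.shape, d.shape)
```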
The following examples use the array:
a = np.array([[4, 2, 1], [3, 6, 5]])
Depicted as:
4
2
1
3
6
5
Sort by ‘row’:
np.sort(a)
Sort by ‘row’, but output the index:
np.argsort(a)
Flatten, then sort:
np.sort(a, axis=None)
Sort by ‘column’:
np.sort(a, axis=0) # Or in place: a.sort(axis=0)
Function | Description |
---|---|
argsort | Returns the indices that would sort an array |
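The indices from argsort are useful for applying a sort order elsewhere. The sketch below (variable names are illustrative) applies them back to the array with np.take_along_axis, and also uses them to reorder whole rows by their first column:

```python
import numpy as np

a = np.array([[4, 2, 1], [3, 6, 5]])

# Per-row indices that would sort each row.
idx = np.argsort(a)
by_idx = np.take_along_axis(a, idx, axis=1)

# Reorder whole rows by the values in the first column.
order = np.argsort(a[:, 0])
rows_sorted = a[order]

print(idx)
print(by_idx)
print(rows_sorted)
```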
Convert the array a = np.array([[1, 2], [3, 4]])
:
1
2
3
4
So that it looks like this:
1
3
2
4
Is it possible to reshape the 2x2 array a = np.array([[1, 2], [3, 4]])
into a 4x1 array? If so what is the procedure? Assign this new array to variable b
.
Flatten the arrays a
and b
from exercise 9-2, and combine them into a single 2x4 array assigned to c
, resulting in:
array([[1, 2, 3, 4],
[1, 2, 3, 4]])
Use a rotate operation upon c to convert it into the 4x2 array:
array([[4, 4],
[3, 3],
[2, 2],
[1, 1]])
Convert the 1-D array a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
into a 3x3 array sorted in reverse order, so that the result is:
array([[9, 8, 7],
[6, 5, 4],
[3, 2, 1]])
Arrays made up of string values can be created, copied, and otherwise manipulated just like any NumPy array. However, given that the values are non-numeric, many, if not most of the numeric operations we’ve seen throughout this book will not be applicable.
The new numpy.strings
module (as of NumPy 2.0) provides a set of functions and utilities for the manipulation of string-based NumPy arrays. It’s designed to handle string data efficiently, offering vectorised operations that can be applied to entire arrays of strings. There are operations for string splitting, case conversion, substring searching, and more - optimised for performance on large datasets compared to standard Python string methods.
String-based array processing in NumPy can be useful for:
The following examples rely on the array, a
, defined as follows:
import numpy as np
a = np.array(["Apple", "Banana", "Cherry", "date", "",
"42"], dtype="str_")
String arrays can be tested for any; if at least one element evaluates to True:
np.any(a)
String arrays can also be tested for all; if every element evaluates to True:
np.all(a)
String arrays can be easily converted to lower or upper-case, or even capitalised:
np.strings.lower(a)
np.strings.upper(a)
np.strings.capitalize(a)
String array values may be self-concatenated to produce repeating values:
np.strings.multiply(a, 3)
Replacing text inside strings is straightforward:
np.strings.replace(a, 'Ba', 'XE')
Compare the equality of string arrays to return an array of True
| False
values:
b = np.array(["Apple", "Orange", "Cherry", "date", "",
"97"], dtype="str_")
np.strings.equal(a, b)
Find the occurrences of a sub-string in the array (the result holds the lowest index of the match in each element, or -1 where it is not found):
np.strings.find(a, "ang")
Check whether any values are numeric (the entire value represents a number):
np.strings.isnumeric(a)
Test whether any value starts with, or ends with a sub-string:
np.strings.startswith(a, "Ap")
np.strings.endswith(a, "e")
Function | Description |
---|---|
strip | Remove leading and trailing characters (defaults to whitespace) |
zfill | Return a numeric string left-filled with zeros |
not_equal | Return (x1 != x2) element-wise |
greater | Return the truth value of (x1 > x2) element-wise |
less | Return the truth value of (x1 < x2) element-wise |
isspace | Return the truth value if there are only whitespace characters |
isalpha | Return the truth value if there are only alphabetic characters |
str_len | Returns the length of each element |
swapcase | Uppercase characters converted to lowercase and vice versa |
Swap the case of the following array so that uppercase letters are converted to lowercase, and lower to upper:
['Hello', 'WORLD', 'FROM', 'Python!']
Pad the following array of numeric strings with zeros, up to a width of 4:
['42', '97', '2005', '0025']
Test whether the values in the following array consist only of alphabetic characters:
['Los', 'Angeles', 'Year 2019']
Find the length of each value in the previous array.
When you install a lot of software packages over time, your system can become bloated and can run the risk of programs failing. This can be due to conflicting package versions, or packages that were removed inadvertently that a program might rely on. Or maybe you’ve tried to run different versions of the same application (with potentially conflicting dependencies) for testing.
A solution to address these problems is to create virtual environments – isolated, self-contained ‘sand-boxes’ that are perfect for protecting your applications from potential dependency and usage conflicts.
virtualenv
virtualenv
One of the most popular programs for creating virtual environments is virtualenv, itself a package that you install via PyPI:
pip install virtualenv
Once installed you’re ready to create virtual environments. It’s a good idea to have a top-level folder under which all your virtual environments reside. From now on we’ll refer to a virtual environment simply as a “venv”.
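To create a venv, change into the directory of the parent location for your venvs. Existing venvs will reside here as sub-directories. To create a venv run the following command (shown here with Python’s built-in venv module; with the virtualenv package installed above, virtualenv my-venv does the same job):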
python -m venv my-venv
The name of the venv can be anything you like, but of course uniquely named under this location. To enter a venv you need to ‘activate’ it, for example:
source my-venv/bin/activate
You can do this from anywhere by passing the fully qualified path to the venv folder:
source /path/to/my-venv/bin/activate
On Windows the activation script lives in a Scripts folder rather than bin, so in PowerShell this might look like this (a . (dot) operator is equivalent to source):
. C:\path\to\my-venv\Scripts\activate
When you enter the venv, you’ll notice a special prompt that informs you that you’re inside a venv. For example:
Once you’re in the venv, you can execute python scripts or install packages as you would normally.
python --version
pip install numpy jupyterlab
(Output not shown)
Packages will be installed for this venv without knowledge of other venvs. You can then run your programs as you would in a global environment:
jupyter lab
# [Ctrl+c] to stop Jupyter..
To exit out of a venv, shut down any running programs and type:
deactivate
and you’ll be returned to a regular console prompt.
There’s a lot more to virtual environments in Python (including the ability to enter/execute/exit them via shell scripts), so consult the online documentation or follow a good tutorial.
Docker is a modern, small footprint alternative to hardware virtualisation1. Instead of running an entire guest operating system as a virtual machine (composed of the full complement of an operating system’s disk, memory, and processing designated in advance), Docker containers are light-weight system shells that install the bare minimum that’s required to run an application.
A container then relies on the host’s infrastructure for access to the operating system kernel and interfaces. This means you can run multiple Docker containers on commodity hardware. Containers are sand-boxed environments that isolate your apps, inside what looks like — to them — a stand alone operating system, of which there are many variants, most commonly Linux based.
Docker containers can be created from scratch, but there are also images available that you can point to that are already set up with all or most of an application’s needs. There’s even a Jupyter image which you can use to launch a Jupyter-ready container after Docker is installed2.
The easiest way to install and get Docker running is to use Docker Desktop3. Once you have it installed, start Docker Desktop, and this will also start the Docker engine. You can find and install images via the “Images — Hub” area in the Docker Desktop application, otherwise it’s also very easy to do via a command line:
docker pull quay.io/jupyter/scipy-notebook
Once this completes, you can run Jupyter as follows:
docker run -p 10000:8888 quay.io/jupyter/scipy-notebook
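This starts a Jupyter server inside the container and prints a URL containing an access token to the terminal. Because the -p 10000:8888 option maps port 8888 in the container to port 10000 on the host, open that URL in a web browser with localhost:10000 as the address (not localhost:8888), and you can start working with Python notebooks.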
array Vs NumPy Arrays
Python lists are a native language feature that lets you easily create array-like data sequences, using bracket ([]) indexing to access the data. Lists are also expandable: you can add and remove items at any time. Whilst flexible, they have limitations, and the array object is available to use if lists don’t meet your engineering goals. For advanced data processing, however, the NumPy array (ndarray) may be the superior choice.
When you think of arrays, you typically expect them to have the following characteristics:
Be of a fixed size
Be of the same type
Have efficient indexing
Ability to efficiently operate on them
Multidimensional structure.
We’ll discuss lists, the array module, and NumPy arrays in the context of these features, and highlight how they differ and when you might prefer one over the others.
Python lists are simple to create, for example:
items = [1, 2.5, 'N/A', [555, 'FILK']]
items
This is a perfectly valid Python list, but you can immediately see that a list can be made up of any type. Lists, therefore, put the responsibility on the engineer to enforce the rules of typing. And whilst it’s perfectly acceptable to stick to a convention, the lack of language-level enforcement can lead to problems simply because nothing prevents the list from being created with the wrong types in the first place.
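To make the mixed typing concrete, the short sketch below inspects the type of each element in the same list. The names type_names and all_numeric are illustrative additions, not part of the original example.

```python
items = [1, 2.5, 'N/A', [555, 'FILK']]

# Collect the type name of every element to see how mixed the list is.
type_names = [type(item).__name__ for item in items]
print(type_names)  # ['int', 'float', 'str', 'list']

# A convention check the engineer has to write by hand: are all items numbers?
all_numeric = all(isinstance(item, (int, float)) for item in items)
print(all_numeric)  # False
```

The language accepts the list without complaint, so any check like all_numeric has to be written and run by you.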
Lists are also expandable, and therefore don’t have a fixed size:
items.pop(3)
items
So far we’ve broken the first two “rules” of the array test. But what about indexing? Lists are easily indexed, which meets goal 3. But without enforcement, you may not know what type you’re getting for certain. On top of that, in order to operate on a list in an element-wise fashion, you need to iterate over the entire list, for example:
for i in items:
    print(i)
You can be a little more expressive with list comprehensions:
[i for i in items]
But at the end of the day, it’s just a more concise way of looping over a list.
Furthermore, if you want to operate on a list you need to be sure to handle potential problems with types, as they may not be compatible with the operation, for example:
[i * 2 for i in items]
This example ostensibly performs a multiplication operation on the data, and this time gets a result without failing; but the resulting value may vary unexpectedly depending on an element’s type (* is an overloaded operator and works with strings differently to numbers), or it may fail entirely.
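The following sketch shows both outcomes side by side, assuming the list holds the three items left after the earlier pop. The names doubled and failed are illustrative.

```python
items = [1, 2.5, 'N/A']

# '*' multiplies numbers but repeats sequences such as strings.
doubled = [i * 2 for i in items]
print(doubled)  # [2, 5.0, 'N/AN/A']

# Other operators are less forgiving: adding an int to a str raises TypeError.
failed = False
try:
    [i + 2 for i in items]
except TypeError as exc:
    failed = True
    print("Failed:", exc)
```

The first comprehension succeeds but silently produces a repeated string. The second stops at the first incompatible element.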
Given that lists can be made up of other lists, you can design them to have a multidimensional structure:
And this looks a lot like a NumPy array — in fact you could use this array-like object to create a NumPy array, as we’ve seen before in this book. But relying on lists for n-D arrays adds to the complexity of enforcing types and managing dimension sizes. They soon become inefficient (multiple loops and complex logic) and unreliable.
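As a minimal sketch of this idea, the nested list below stands in for a small two-dimensional structure. The names grid, ragged and arr are illustrative assumptions, not taken from the original text.

```python
import numpy as np

# A list of equally sized lists behaves like a table of 2 rows and 3 columns.
grid = [[1, 2, 3],
        [4, 5, 6]]

# Reaching one element means indexing the outer list, then the inner one.
print(grid[1][2])  # 6

# Nothing stops the rows from drifting apart in length.
ragged = [[1, 2, 3], [4, 5]]
print([len(row) for row in ragged])  # [3, 2]

# NumPy accepts the regular nested list and records its shape.
arr = np.array(grid)
print(arr.shape)  # (2, 3)
print(arr[1, 2])  # 6
```

With the nested list, keeping every row the same length is your responsibility. The ndarray makes the shape an explicit property of the structure.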
array Array
The array module ships with Python as a core library feature, so you don’t have to install it using pip, but you do need to import it:
import array
To create an array you are required to specify the type. Thirteen type codes were defined up to Python 3.12, and Python 3.13 adds a fourteenth ('w'). Here are a few of them, with the minimum sizes given in the Python documentation:
'b': a signed char of 1 byte
'i': a signed int of at least 2 bytes
'l': a signed long of at least 4 bytes
'f': a float of 4 bytes
'd': a double of 8 bytes
You can also evaluate array.typecodes to get a quick printout of all codes:
array.typecodes
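Creating an array is done as follows (the first argument is the type code). Because the module was imported with import array, the constructor is referenced as array.array:
a = array.array('i', [1, 2, 3, 4, 5])
a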
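A short sketch of what the type code buys you is shown below. The flag rejected is an illustrative addition, and the item size printed depends on your platform.

```python
import array

a = array.array('i', [1, 2, 3, 4, 5])
print(a.typecode, a.itemsize)  # 'i' and the platform's int size in bytes

# A value of the wrong type is rejected instead of being stored silently.
rejected = False
try:
    a.append(2.5)
except TypeError as exc:
    rejected = True
    print("Rejected:", exc)

# Like a list, the array can still grow after creation.
a.append(6)
print(len(a))  # 6
```

Type enforcement happens at the point of insertion, so a bad value is caught early rather than surfacing later in a calculation.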
You have overcome the problem of loose typing with array arrays, and have at least created a typed structure up front, although, like a list, it can still grow or shrink afterwards. But the array is not designed to be multidimensional, unless you create a list of array arrays:
b = array.array('i', [1, 2, 3])
c = array.array('f', [2., 3., 4.])
a = [b, c]
a
Overcoming one problem, however, just re-introduces the problem that you had to begin with — lists. And you now have two different structures to deal with when processing.
The entire book has been dedicated to explaining the importance and use of ndarray arrays in NumPy, so there’s no need to re-state the case at any length. And given the limitations of lists or array arrays, it should now be clearer if and when you might decide to upgrade from one approach to the next.
If you need a simple, mutable data structure that’s easy to create, you are happy to take responsibility for typing yourself, and you won’t be performing very complex operations on it, then stick with lists. Lists are highly interoperable and easy to work with, especially alongside list comprehensions or generators.
An array is something of an improvement on lists: you could call it a wrapper around simple lists that enforces type checking. Another reason to prefer an array over a list is when you might need to persist simple data structures to file storage. It would be prudent to have type safety in this case, especially if the stored data will be shared among software components.
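The sketch below shows one way to persist an array to a file, using the tofile and fromfile methods of the standard library's array type. The file name readings.bin and the variable names are hypothetical. The file is written to a temporary directory purely for illustration.

```python
import array
import os
import tempfile

readings = array.array('d', [1.5, 2.5, 3.5])

# Write the raw machine values to a file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'readings.bin')
with open(path, 'wb') as f:
    readings.tofile(f)

# Read the values back into an empty array with the same type code.
restored = array.array('d')
with open(path, 'rb') as f:
    restored.fromfile(f, len(readings))

print(restored == readings)   # True
print(os.path.getsize(path))  # 3 items * 8 bytes each
```

The file holds only the raw values, so whoever reads it must already know the type code and item count. That is one reason agreeing on a type up front matters when data is shared.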
For everything else, use NumPy. Or, find a library such as SciPy or Pandas that builds on NumPy to provide the more specialised capabilities you’re after.
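To round off the comparison, here is a small sketch that moves the same data through all three structures. The variable names are illustrative, and passing an array.array to np.array is one of several possible conversions.

```python
import array
import numpy as np

prices = [1.5, 2.0, 3.25]           # plain list
typed = array.array('d', prices)    # typed, one-dimensional array
nd = np.array(typed)                # ndarray built from the typed array

# A list needs an explicit loop or comprehension for element-wise maths.
doubled_list = [p * 2 for p in prices]

# An ndarray applies the operation to every element in one expression.
doubled_nd = nd * 2

print(doubled_list)         # [3.0, 4.0, 6.5]
print(doubled_nd.tolist())  # [3.0, 4.0, 6.5]
print(nd.dtype)             # float64
```

The results are identical here. The difference lies in how the operation is expressed and how well it scales to larger, multidimensional data.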
Member information listed here (grouped by modules) was extracted from the NumPy source code docstrings, as of NumPy version 2.2.3. Some modules have been excluded, for example numpy.ctypeslib and numpy.testing. numpy.matrix is no longer recommended and stands to be deprecated.
Listings of other sub-classes such as numpy.chararray and numpy.bmat were also excluded for brevity. You can see a full schedule of classes at the NumPy documentation web page1.
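If you would like to produce a similar listing for the NumPy version you have installed, the sketch below shows one possible approach. It is not the script used to build these tables. The helper name summary_line and the rule for skipping a leading call signature are assumptions made for illustration.

```python
import inspect

import numpy as np


def summary_line(name):
    """Return the first descriptive docstring line of a numpy member."""
    doc = inspect.getdoc(getattr(np, name)) or ""
    for line in doc.splitlines():
        line = line.strip()
        # Skip blank lines and the call signature that some docstrings
        # (for example those of ufuncs) place on their first line.
        if line and not line.startswith(name + "("):
            return line
    return ""


for member in ("add", "mean", "arange"):
    print(member, "|", summary_line(member))
```

Looping over dir(np) instead of a fixed tuple would cover every public name, although some entries are constants or sub-modules whose docstrings describe their type rather than their purpose.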
numpy
This section lists only the immediate members of the top-level numpy module. Relevant sub-modules and sub-classes are listed separately.
numpy A-F
Member | Description |
---|---|
absolute | Calculate the absolute value element-wise. |
add | Add arguments element-wise. |
add_docstring | Add a docstring to a built-in obj if possible. |
add_newdoc | Add documentation to an existing object, typically one defined in C. |
add_newdoc_ufunc | Replace the docstring for a ufunc with new_docstring. |
all | Test whether all array elements along a given axis evaluate to True. |
allclose | Returns True if two arrays are element-wise equal within a tolerance. |
alltrue | Check if all elements of input array are true. |
amax | Return the maximum of an array or maximum along an axis. |
amin | Return the minimum of an array or minimum along an axis. |
angle | Return the angle of the complex argument. |
any | Test whether any array element along a given axis evaluates to True. |
append | Append values to the end of an array. |
apply_along_axis | Apply a function to 1-D slices along the given axis. |
apply_over_axes | Apply a function repeatedly over multiple axes. |
arange | Return evenly spaced values within a given interval. |
arccos | Trigonometric inverse cosine, element-wise. |
arccosh | Inverse hyperbolic cosine, element-wise. |
arcsin | Inverse sine, element-wise. |
arcsinh | Inverse hyperbolic sine element-wise. |
arctan | Trigonometric inverse tangent, element-wise. |
arctan2 | Element-wise arc tangent of ‘x1/x2’ choosing the quadrant correctly. |
arctanh | Inverse hyperbolic tangent element-wise. |
argmax | Returns the indices of the maximum values along an axis. |
argmin | Returns the indices of the minimum values along an axis. |
argpartition | Perform an indirect partition along the given axis using the algorithm specified by the ‘kind’ keyword. |
argsort | Returns the indices that would sort an array. |
argwhere | Find the indices of array elements that are non-zero, grouped by element. |
around | Round an array to the given number of decimals. |
array | Create an array. |
array2string | Return a string representation of an array. |
array_equal | True if two arrays have the same shape and elements, False otherwise. |
array_equiv | Returns True if input arrays are shape consistent and all elements equal. |
astype | Copies an array to a specified data type. |
atleast_2d | View inputs as arrays with at least two dimensions. |
atleast_3d | View inputs as arrays with at least three dimensions. |
average | Compute the weighted average along the specified axis. |
bartlett | Return the Bartlett window. |
base_repr | Return a string representation of a number in the given base system. |
binary_repr | Return the binary representation of the input number as a string. |
bincount | Count number of occurrences of each value in array of non-negative ints. |
bitwise_and | Compute the bit-wise AND of two arrays element-wise. |
bitwise_count | Computes the number of 1-bits in the absolute value of x. |
bitwise_not | Compute bit-wise inversion, or bit-wise NOT, element-wise. |
bitwise_or | Compute the bit-wise OR of two arrays element-wise. |
bitwise_xor | Compute the bit-wise XOR of two arrays element-wise. |
blackman | Return the Blackman window. |
block | Assemble an nd-array from nested lists of blocks. |
bmat | Build a matrix object from a string, nested sequence, or array. |
bool_ | Boolean type (True or False), stored as a byte. |
broadcast | Produce an object that mimics broadcasting. |
broadcast_arrays | Broadcast any number of arrays against each other. |
broadcast_shapes | Broadcast the input shapes into a single shape. |
broadcast_to | Broadcast an array to a new shape. |
busday_count | Counts the number of valid days between ‘begindates’ and ‘enddates’, not including the day of ‘enddates’. |
busday_offset | First adjusts the date to fall on a valid day according to the ‘roll’ rule, then applies offsets to the given dates. |
busdaycalendar | A business day calendar object that efficiently stores information defining valid days for the busday family of functions. |
byte | Signed integer type, compatible with C ‘char’. |
byte_bounds | Returns pointers to the end-points of an array. |
bytes_ | A byte string. |
c_ | Translates slice objects to concatenation along the second axis. |
can_cast | Returns True if cast between data types can occur according to the casting rule. |
cbrt | Return the cube-root of an array, element-wise. |
cdouble | Complex number type composed of two double-precision floating-point numbers. |
ceil | Return the ceiling of the input, element-wise. |
char | This module contains a set of functions for vectorized string operations and methods. |
character | Abstract base class of all character string scalar types. (To be deprecated) |
choose | Construct an array from an index array and a list of arrays to choose from. |
clip | Given an interval, values outside the interval are clipped to the interval edges. |
clongdouble | Complex number type composed of two extended-precision floating-point numbers. |
column_stack | Stack 1-D arrays as columns into a 2-D array. |
common_type | Return a scalar type which is common to the input arrays. |
compat | |
complex128 | Complex number type composed of two double-precision floating-point numbers, compatible with Python ‘complex’. |
complex256 | Complex number type composed of two extended-precision floating-point numbers. |
complex64 | Complex number type composed of two single-precision floating-point numbers. |
complexfloating | Abstract base class of all complex number scalar types that are made up of floating-point numbers. |
compress | Return selected slices of an array along given axis. |
concatenate | Join a sequence of arrays along an existing axis. |
conj | Return the complex conjugate, element-wise. |
conjugate | Return the complex conjugate, element-wise. |
convolve | Returns the discrete, linear convolution of two one-dimensional sequences. |
copy | Return an array copy of the given object. |
copysign | Change the sign of x1 to that of x2, element-wise. |
copyto | Copies values from one array to another, broadcasting as necessary. |
core | Contains the core of NumPy: ndarray, ufuncs, dtypes, etc. |
corrcoef | Return Pearson product-moment correlation coefficients. |
correlate | Cross-correlation of two 1-dimensional sequences. |
cos | Cosine element-wise. |
cosh | Hyperbolic cosine, element-wise. |
count_nonzero | Counts the number of non-zero values in the array ‘a’. |
cov | Estimate a covariance matrix, given data and weights. |
cross | Return the cross product of two (arrays of) vectors. |
csingle | Complex number type composed of two single-precision floating-point numbers. |
cumprod | Return the cumulative product of elements along a given axis. |
cumulative_prod | Compatible alternatives for cumprod. |
cumproduct | Return the cumulative product over the given axis. |
cumsum | Return the cumulative sum of the elements along a given axis. |
cumulative_sum | Compatible alternatives for cumsum. |
datetime64 | If created from a 64-bit integer, it represents an offset from ‘1970-01-01T00:00:00’. |
datetime_as_string | Convert an array of datetimes into an array of strings. |
datetime_data | Get information about the step size of a date or time type. |
deg2rad | Convert angles from degrees to radians. |
degrees | Convert angles from radians to degrees. |
delete | Return a new array with sub-arrays along an axis deleted. |
diag | Extract a diagonal or construct a diagonal array. |
diag_indices | Return the indices to access the main diagonal of an array. |
diag_indices_from | Return the indices to access the main diagonal of an n-dimensional array. |
diagflat | Create a two-dimensional array with the flattened input as a diagonal. |
diagonal | Return specified diagonals. |
diff | Calculate the n-th discrete difference along the given axis. |
digitize | Return the indices of the bins to which each value in input array belongs. |
disp | |
divide | Divide arguments element-wise. |
divmod | Return element-wise quotient and remainder simultaneously. |
dot | Dot product of two arrays. |
double | Double-precision floating-point number type, compatible with Python ‘float’ and C ‘double’. |
dsplit | Split array into multiple sub-arrays along the 3rd axis (depth). |
dstack | Stack arrays in sequence depth wise (along third axis). |
e | Euler’s constant, the base of natural logarithms (approximately 2.71828). |
ediff1d | The differences between consecutive elements of an array. |
einsum | Evaluates the Einstein summation convention on the operands. |
einsum_path | Evaluates the lowest cost contraction order for an einsum expression by considering the creation of intermediate arrays. |
emath | Wrapper functions to more user-friendly calling of certain math functions. |
empty | Return a new array of given shape and type, without initializing entries. |
empty_like | Return a new array with the same shape and type as a given array. |
equal | Return (x1 == x2) element-wise. |
errstate | Context manager for floating-point error handling. |
euler_gamma | The Euler-Mascheroni constant (approximately 0.57722). |
exp | Calculate the exponential of all elements in the input array. |
exp2 | Calculate ‘2^p’ for all ‘p’ in the input array. |
expand_dims | Insert a new axis that will appear at the ‘axis’ position in the expanded array shape. |
expm1 | Calculate ‘exp(x) - 1’ for all elements in the array. |
extract | Return the elements of an array that satisfy some condition. |
eye | Return a 2-D array with ones on the diagonal and zeros elsewhere. |
fabs | Compute the absolute values element-wise. |
fill_diagonal | Fill the main diagonal of the given array of any dimensionality. |
finfo | Machine limits for floating point types. |
fix | Round to nearest integer towards zero. |
flatiter | Flat iterator object to iterate over arrays. |
flatnonzero | Return indices that are non-zero in the flattened version of a. |
flexible | Abstract base class of all scalar types without predefined length. |
flip | Reverse the order of elements in an array along the given axis. |
fliplr | Reverse the order of elements along axis 1 (left/right). |
flipud | Reverse the order of elements along axis 0 (up/down). |
float128 | Extended-precision floating-point number type, compatible with C ‘long double’ but not necessarily with IEEE 754 quadruple-precision. |
float16 | Half-precision floating-point number type. |
float32 | Single-precision floating-point number type, compatible with C ‘float’. |
float64 | Double-precision floating-point number type, compatible with Python ‘float’ and C ‘double’. |
float_power | First array elements raised to powers from second array, element-wise. |
floating | Abstract base class of all floating-point scalar types. |
floor | Return the floor of the input, element-wise. |
floor_divide | Return the largest integer smaller or equal to the division of the inputs. |
fmax | Element-wise maximum of array elements. |
fmin | Element-wise minimum of array elements. |
fmod | Returns the element-wise remainder of division. |
format_float_positional | Format a floating-point scalar as a decimal string in positional notation. |
format_float_scientific | Format a floating-point scalar as a decimal string in scientific notation. |
format_parser | Class to convert formats, names, titles description to a dtype. |
frexp | Decompose the elements of x into mantissa and twos exponent. |
frombuffer | Interpret a buffer as a 1-dimensional array. |
fromfile | Construct an array from data in a text or binary file. |
fromfunction | Construct an array by executing a function over each coordinate. |
fromiter | Create a new 1-dimensional array from an iterable object. |
frompyfunc | Takes an arbitrary Python function and returns a NumPy ufunc. |
fromregex | Construct an array from a text file, using regular expression parsing. |
fromstring | A new 1-D array initialized from text data in a string. |
full | Return a new array of given shape and type, filled with ‘fill_value’. |
full_like | Return a full array with the same shape and type as a given array. |
numpy G-O
Member | Description |
---|---|
gcd | Returns the greatest common divisor of ‘x1’ and ‘x2’ |
generic | Base class for numpy scalar types. |
genfromtxt | Load data from a text file, with missing values handled as specified. |
geomspace | Return numbers spaced evenly on a log scale (a geometric progression). |
get_array_wrap | |
get_include | Return the directory that contains the NumPy *.h header files. |
get_printoptions | Return the current print options. |
getbufsize | Return the size of the buffer used in ufuncs. |
geterr | Get the current way of handling floating-point errors. |
geterrcall | Return the current callback function used on floating-point errors. |
gradient | Return the gradient of an N-dimensional array. |
greater | Return the truth value of (x1 > x2) element-wise. |
greater_equal | Return the truth value of (x1 >= x2) element-wise. |
half | Half-precision floating-point number type. |
hamming | Return the Hamming window. |
hanning | Return the Hanning window. |
heaviside | Compute the Heaviside step function. |
histogram | Compute the histogram of a dataset. |
histogram2d | Compute the bi-dimensional histogram of two data samples. |
histogram_bin_edges | Function to calculate only the edges of the bins used by the ‘histogram’ function. |
histogramdd | Compute the multidimensional histogram of some data. |
hsplit | Split an array into multiple sub-arrays horizontally (column-wise). |
hstack | Stack arrays in sequence horizontally (column wise). |
hypot | Given the “legs” of a right triangle, return its hypotenuse. |
i0 | Modified Bessel function of the first kind, order 0. |
identity | Return the identity array. |
iinfo | Machine limits for integer types. |
imag | Return the imaginary part of the complex argument. |
in1d | |
index_exp | A nicer way to build up index tuples for arrays. |
indices | Return an array representing the indices of a grid. |
inexact | Abstract base class of all numeric scalar types with a (potentially) inexact representation of the values in its range. |
inf | IEEE 754 floating-point representation of positive infinity. |
info | Get help information for an array, function, class, or module. |
inner | Inner product of two arrays. |
insert | Insert values along the given axis before the given indices. |
int16 | Signed integer type, compatible with C ‘short’. |
int32 | Signed integer type, compatible with C ‘int’. |
int64 | Signed integer type, compatible with Python ‘int’ and C ‘long’. |
int8 | Signed integer type, compatible with C ‘char’. |
int_ | Signed integer type, compatible with Python ‘int’ and C ‘long’. |
intc | Signed integer type, compatible with C ‘int’. |
integer | Abstract base class of all integer scalar types. |
interp | One-dimensional linear interpolation for monotonically increasing sample points. |
intersect1d | Find the intersection of two arrays. |
intp | Signed integer type, compatible with Python ‘int’ and C ‘long’. |
invert | Compute bit-wise inversion, or bit-wise NOT, element-wise. |
is_busday | Calculates which of the given dates are valid days, and which are not. |
isclose | Returns a boolean array where two arrays are element-wise equal within a tolerance. |
iscomplex | Returns a bool array, where True if input element is complex. |
iscomplexobj | Check for a complex type or an array of complex numbers. |
isdtype | Determine if a provided dtype is of a specified data type kind. |
isfinite | Test element-wise for finiteness (not infinity and not Not a Number). |
isfortran | Check if the array is Fortran contiguous but not C contiguous. |
isin | Calculates ‘element in test_elements’, broadcasting over ‘element’ only. |
isinf | Test element-wise for positive or negative infinity. |
isnan | Test element-wise for NaN and return result as a boolean array. |
isnat | Test element-wise for NaT (not a time) and return result as a boolean array. |
isneginf | Test element-wise for negative infinity, return result as bool array. |
isposinf | Test element-wise for positive infinity, return result as bool array. |
isreal | Returns a bool array, where True if input element is real. |
isrealobj | Return True if x is not a complex type or an array of complex numbers. |
isscalar | Returns True if the type of ‘element’ is a scalar type. |
issubdtype | Returns True if first argument is a typecode lower/equal in type hierarchy. |
issubsctype | Determine if the first argument is a subclass of the second argument. |
iterable | Check whether or not an object can be iterated over. |
ix_ | Construct an open mesh from multiple sequences. |
kaiser | Return the Kaiser window. |
kernel_version | Built-in immutable sequence. |
kron | Kronecker product of two arrays. |
lcm | Returns the lowest common multiple of ‘x1’ and ‘x2’ |
ldexp | Returns x1 * 2^x2, element-wise. |
left_shift | Shift the bits of an integer to the left. |
less | Return the truth value of (x1 < x2) element-wise. |
less_equal | Return the truth value of (x1 <= x2) element-wise. |
lexsort | Perform an indirect stable sort using a sequence of keys. |
lib | Note: almost all functions in the ‘numpy.lib’ namespace are also present in the main ‘numpy’ namespace. |
linspace | Return evenly spaced numbers over a specified interval. |
little_endian | Boolean flag that is True when the host byte order is little-endian. |
load | Load arrays or pickled objects from ‘.npy’, ‘.npz’ or pickled files. |
loadtxt | Load data from a text file. |
log | Natural logarithm, element-wise. |
log10 | Return the base 10 logarithm of the input array, element-wise. |
log1p | Return the natural logarithm of one plus the input array, element-wise. |
log2 | Base-2 logarithm of ‘x’. |
logaddexp | Logarithm of the sum of exponentiations of the inputs. |
logaddexp2 | Logarithm of the sum of exponentiations of the inputs in base-2. |
logical_and | Compute the truth value of x1 AND x2 element-wise. |
logical_not | Compute the truth value of NOT x element-wise. |
logical_or | Compute the truth value of x1 OR x2 element-wise. |
logical_xor | Compute the truth value of x1 XOR x2, element-wise. |
logspace | Return numbers spaced evenly on a log scale. |
longdouble | Extended-precision floating-point number type, compatible with C ‘long double’ but not necessarily with IEEE 754 quadruple-precision. |
longlong | Signed integer type, compatible with C ‘long long’. |
lookfor | Do a keyword search on docstrings. |
mask_indices | Return the indices to access (n, n) arrays, given a masking function. |
math | This module provides access to the mathematical functions defined by the C standard. |
matrix_transpose | Transposes a matrix (or a stack of matrices) x. |
matmul | Matrix product of two arrays. |
matvec | Matrix-vector dot product of two arrays. |
max | Return the maximum of an array or maximum along an axis. |
maximum | Element-wise maximum of array elements. |
may_share_memory | Determine if two arrays might share memory |
mean | Compute the arithmetic mean along the specified axis. |
median | Compute the median along the specified axis. |
memmap | Create a memory-map to an array stored in a binary file on disk. |
meshgrid | Return a list of coordinate matrices from coordinate vectors. |
mgrid | An instance which returns a dense multi-dimensional “meshgrid”. |
min | Return the minimum of an array or minimum along an axis. |
min_scalar_type | For scalar ‘a’, returns the data type with the smallest size and smallest scalar kind which can hold its value. |
minimum | Element-wise minimum of array elements. |
mintypecode | Return the character for the minimum-size type to which given types can be safely cast. |
mod | Returns the element-wise remainder of division. |
modf | Return the fractional and integral parts of an array, element-wise. |
moveaxis | Move axes of an array to new positions. Other axes remain in their original order. |
multiply | Multiply arguments element-wise. |
nan | IEEE 754 floating-point representation of Not a Number (NaN). |
nan_to_num | Replace NaN with zero and infinity with large finite numbers (default behaviour). |
nanargmax | Return the indices of the maximum values in the specified axis ignoring NaNs. |
nanargmin | Return the indices of the minimum values in the specified axis ignoring NaNs. |
nancumprod | Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. |
nancumsum | Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
nanmax | Return the maximum of an array or maximum along an axis, ignoring any NaNs. |
nanmean | Compute the arithmetic mean along the specified axis, ignoring NaNs. |
nanmedian | Compute the median along the specified axis, while ignoring NaNs. |
nanmin | Return minimum of an array or minimum along an axis, ignoring any NaNs. |
nanpercentile | Compute the qth percentile of the data along the specified axis, while ignoring nan values. |
nanprod | Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones. |
nanquantile | Compute the qth quantile of the data along the specified axis, while ignoring nan values. |
nanstd | Compute the standard deviation along the specified axis, while ignoring NaNs. |
nansum | Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
nanvar | Compute the variance along the specified axis, while ignoring NaNs. |
ndenumerate | Multidimensional index iterator. |
ndim | Return the number of dimensions of an array. |
ndindex | An N-dimensional iterator object to index arrays. |
nditer | Efficient multi-dimensional iterator object to iterate over arrays. |
negative | Numerical negative, element-wise. |
nested_iters | Create nditers for use in nested loops |
nextafter | Return the next floating-point value after x1 towards x2, element-wise. |
nonzero | Return the indices of the elements that are non-zero. |
not_equal | Return (x1 != x2) element-wise. |
numarray | Placeholder for a removed legacy module; no help text is available. |
number | Abstract base class of all numeric scalar types. |
object_ | Any Python object. |
ogrid | An instance which returns an open multi-dimensional “meshgrid”. |
oldnumeric | Placeholder for a removed legacy module; no help text is available. |
ones | Return a new array of given shape and type, filled with ones. |
ones_like | Return array of ones with the same shape and type as given array. |
outer | Compute the outer product of two vectors. |
numpy
P — ZMember | Description |
---|---|
packbits |
Packs the elements of a binary-valued array into bits in a uint8 array. |
pad |
Pad an array. |
partition |
Return a partitioned copy of an array. |
percentile |
Compute the q-th percentile of the data along the specified axis. |
pi |
Convert a string or number to a floating point number, if possible. |
piecewise |
Evaluate a piecewise-defined function. |
place |
Change elements of an array based on conditional and input values. |
poly |
Find the coefficients of a polynomial with the given sequence of roots. |
poly1d |
A one-dimensional polynomial class. |
polyadd |
Find the sum of two polynomials. |
polyder |
Return the derivative of the specified order of a polynomial. |
polydiv |
Returns the quotient and remainder of polynomial division. |
polyfit |
Least squares polynomial fit. |
polyint |
Return an antiderivative (indefinite integral) of a polynomial. |
polymul |
Find the product of two polynomials. |
polynomial |
A sub-package for efficiently dealing with polynomials. |
polysub |
Difference (subtraction) of two polynomials. |
polyval |
Evaluate a polynomial at specific values. |
positive |
Numerical positive, element-wise. |
power |
First array elements raised to powers from second array, element-wise. |
printoptions |
Context manager for setting print options. |
prod |
Return the product of array elements over a given axis. |
product |
Return the product of array elements over a given axis. |
promote_types |
Returns the data type with the smallest size and smallest scalar kind to which both ‘type1’ and ‘type2’ may be safely cast. |
ptp |
Range of values (maximum — minimum) along an axis. |
put |
Replaces specified elements of an array with given values. |
put_along_axis |
Put values into the destination array by matching 1d index and data slices. |
putmask |
Changes elements of an array based on conditional and input values. |
quantile |
Compute the q-th quantile of the data along the specified axis. |
r_ |
Translates slice objects to concatenation along the first axis. |
rad2deg |
Convert angles from radians to degrees. |
radians |
Convert angles from degrees to radians. |
ravel |
Return a contiguous flattened array. |
ravel_multi_index |
Converts a tuple of index arrays into an array of flat indices, applying boundary modes to the multi-index. |
real |
Return the real part of the complex argument. |
real_if_close |
If input is complex with all imaginary parts close to zero, return real parts. |
rec |
Record Arrays |
recarray |
Construct an ndarray that allows field access using attributes. |
recfromtxt |
|
reciprocal |
Return the reciprocal of the argument, element-wise. |
record |
A data-type scalar that allows field access as attribute lookup. |
remainder |
Returns the element-wise remainder of division. |
repeat |
Repeat each element of an array after themselves |
require |
Return an ndarray of the provided type that satisfies requirements. |
reshape |
Gives a new shape to an array without changing its data. |
resize |
Return a new array with the specified shape. |
result_type |
Returns the type that results from applying the NumPy |
right_shift |
Shift the bits of an integer to the right. |
rint |
Round elements of the array to the nearest integer. |
roll |
Roll array elements along a given axis. |
rollaxis |
Roll the specified axis backwards, until it lies in a given position. |
roots |
Return the roots of a polynomial with coefficients given in p. |
rot90 |
Rotate an array by 90 degrees in the plane specified by axes. |
round |
Evenly round to the given number of decimals. |
row_stack |
|
s_ |
A nicer way to build up index tuples for arrays. |
save |
Save an array to a binary file in NumPy ‘.npy’ format. |
savetxt |
Save an array to a text file. |
savez |
Save several arrays into a single file in uncompressed ‘.npz’ format. |
savez_compressed |
Save several arrays into a single file in compressed ‘.npz’ format. |
sctypeDict |
Dictionary mapping type names and character codes to NumPy scalar types. |
searchsorted |
Find indices where elements should be inserted to maintain order. |
select |
Return an array drawn from elements in choicelist, depending on conditions. |
set_printoptions |
Set printing options. |
setbufsize |
Set the size of the buffer used in ufuncs. |
setdiff1d |
Find the set difference of two arrays. |
seterr |
Set how floating-point errors are handled. |
seterrcall |
Set the floating-point error callback function or log object. |
setxor1d |
Find the set exclusive-or of two arrays. |
shape |
Return the shape of an array. |
shares_memory |
Determine if two arrays share memory. |
short |
Signed integer type, compatible with C ‘short’. |
show_config |
Show libraries and system information on which NumPy was built and is being used. |
sign |
Returns an element-wise indication of the sign of a number. |
signbit |
Returns element-wise True where signbit is set (less than zero). |
signedinteger |
Abstract base class of all signed integer scalar types. |
sin |
Trigonometric sine, element-wise. |
sinc |
Return the normalized sinc function. |
single |
Single-precision floating-point number type, compatible with C ‘float’. |
sinh |
Hyperbolic sine, element-wise. |
size |
Return the number of elements along a given axis. |
sometrue |
Check whether some values are true. |
sort |
Return a sorted copy of an array. |
sort_complex |
Sort a complex array using the real part first, then the imaginary part. |
spacing |
Return the distance between x and the nearest adjacent number. |
split |
Split an array into multiple sub-arrays as views into ‘ary’. |
sqrt |
Return the non-negative square-root of an array, element-wise. |
square |
Return the element-wise square of the input. |
squeeze |
Remove axes of length one from ‘a’. |
stack |
Join a sequence of arrays along a new axis. |
std |
Compute the standard deviation along the specified axis. |
str_ |
A unicode string. |
subtract |
Subtract arguments, element-wise. |
sum |
Sum of array elements over a given axis. |
swapaxes |
Interchange two axes of an array. |
take |
Take elements from an array along an axis. |
take_along_axis |
Take values from the input array by matching 1d index and data slices. |
tan |
Compute tangent element-wise. |
tanh |
Compute hyperbolic tangent element-wise. |
tensordot |
Compute tensor dot product along specified axes. |
tile |
Construct an array by repeating A the number of times given by reps. |
timedelta64 |
A timedelta stored as a 64-bit integer. |
trace |
Return the sum along diagonals of the array. |
transpose |
Returns an array with axes transposed. |
trapz |
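Integrate along the given axis using the composite trapezoidal rule; deprecated in favour of trapezoid. |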
tri |
An array with ones at and below the given diagonal and zeros elsewhere. |
tril |
Lower triangle of an array. |
tril_indices |
Return the indices for the lower-triangle of an (n, m) array. |
tril_indices_from |
Return the indices for the lower-triangle of arr. |
trim_zeros |
Trim the leading and/or trailing zeros from a 1-D array or sequence. |
triu |
Upper triangle of an array. |
triu_indices |
Return the indices for the upper-triangle of an (n, m) array. |
triu_indices_from |
Return the indices for the upper-triangle of arr. |
true_divide |
Divide arguments element-wise. |
trunc |
Return the truncated value of the input, element-wise. |
typecodes |
Dictionary mapping type categories (such as ‘AllInteger’ or ‘Float’) to strings of type character codes. |
typename |
Return a description for the given data type code. |
ubyte |
Unsigned integer type, compatible with C ‘unsigned char’. |
ufunc |
Functions that operate element by element on whole arrays. |
uint |
Unsigned integer type, compatible with C ‘unsigned long’. |
uint16 |
Unsigned integer type, compatible with C ‘unsigned short’. |
uint32 |
Unsigned integer type, compatible with C ‘unsigned int’. |
uint64 |
Unsigned integer type, compatible with C ‘unsigned long’. |
uint8 |
Unsigned integer type, compatible with C ‘unsigned char’. |
uintc |
Unsigned integer type, compatible with C ‘unsigned int’. |
uintp |
Unsigned integer type, compatible with C ‘unsigned long’. |
ulonglong |
Unsigned integer type, compatible with C ‘unsigned long long’. |
union1d |
Find the union of two arrays. |
unique |
Find the unique elements of an array. |
unique_all |
Find the unique elements of an array, and counts, inverse, and indices. |
unique_counts |
Find the unique elements and counts of an input array x. |
unique_inverse |
Find the unique elements of x and indices to reconstruct x. |
unique_values |
Returns the unique elements of an input array x. |
unpackbits |
Unpacks elements of a uint8 array into a binary-valued output array. |
unravel_index |
Converts a flat index or array of flat indices into a tuple of coordinate arrays. |
unsignedinteger |
Abstract base class of all unsigned integer scalar types. |
unstack |
Split an array into a sequence of arrays along the given axis. |
unwrap |
Unwrap by taking the complement of large deltas with respect to the period. |
ushort |
Unsigned integer type, compatible with C ‘unsigned short’. |
vander |
Generate a Vandermonde matrix. |
var |
Compute the variance along the specified axis. |
vdot |
Return the dot product of two vectors. |
vecdot |
Vector dot product of two arrays. |
vecmat |
Vector-matrix dot product of two arrays. |
vectorize |
Returns an object that acts like pyfunc, but takes arrays as input. |
void |
Create a new structured or unstructured void scalar. |
vsplit |
Split an array into multiple sub-arrays vertically (row-wise). |
vstack |
Stack arrays in sequence vertically (row wise). |
where |
Return elements chosen from ‘x’ or ‘y’ depending on ‘condition’. |
zeros |
Return a new array of given shape and type, filled with zeros. |
zeros_like |
Return an array of zeros with the same shape and type as a given array. |
numpy.ndarray
Member | Description |
---|---|
alignment |
The required alignment (bytes) of this data-type according to the compiler. |
base |
Returns dtype for the base element of the subarrays, regardless of their dimension or shape. |
byteorder |
Character indicating the byte-order of this dtype object. |
char |
A unique character code for each of the built-in types. |
descr |
‘__array_interface__’ description of the data-type. |
fields |
Dictionary of named fields defined for this type or None. |
flags |
Bit-flags describing how this data type is to be interpreted. |
hasobject |
Boolean indicating whether this dtype contains any reference-counted objects in any fields or sub-dtypes. |
isalignedstruct |
Boolean indicating whether the dtype is a struct which maintains field alignment. |
isbuiltin |
Integer indicating how this dtype relates to built-in dtypes. |
isnative |
Boolean indicating whether the byte order of this dtype is native to the platform. |
itemsize |
The element size of this data-type object. |
kind |
A character code (one of ‘biufcmMOSUV’) identifying the general kind of data. |
metadata |
None, or readonly dict of metadata (mappingproxy). |
name |
A bit-width name for this data-type. |
names |
Ordered list of field names, or ‘None’ if there are no fields. |
ndim |
Number of dimensions of the sub-array if this data type describes a sub-array, and ‘0’ otherwise. |
newbyteorder |
Return a new dtype with a different byte order. |
num |
A unique number for each of the 21 different built-in types. |
shape |
Shape tuple of the sub-array if this data type describes a sub-array, and ‘()’ otherwise. |
str |
The array-protocol typestring of this data-type object. |
subdtype |
Tuple ‘(item_dtype, shape)’ if this ‘dtype’ describes a sub-array, and None otherwise. |
to_device |
For Array API compatibility; since NumPy supports only CPU arrays, this returns the array itself. |
numpy.dtype
Member | Description |
---|---|
alignment |
The required alignment (bytes) of this data-type according to the compiler. |
base |
Returns dtype for the base element of the subarrays, regardless of their dimension or shape. |
byteorder |
A character indicating the byte-order of this data-type object. |
char |
A unique character code for each of the 21 different built-in types. |
descr |
‘__array_interface__’ description of the data-type. |
fields |
Dictionary of named fields for this type or None. |
flags |
Bit-flags describing how this data type is to be interpreted. |
hasobject |
Boolean indicating whether this dtype contains any reference-counted objects in any fields or sub-dtypes. |
isalignedstruct |
Boolean indicating whether the dtype is a struct which maintains field alignment. |
isbuiltin |
Integer indicating how this dtype relates to the built-in dtypes. |
isnative |
Boolean indicating whether the byte order of this dtype is native to the platform. |
itemsize |
The element size of this data-type object. |
kind |
A character code (one of ‘biufcmMOSUV’) identifying the general kind of data. |
metadata |
Either ‘None’ or a readonly dictionary of metadata. |
name |
A bit-width name for this data-type. |
names |
Ordered list of field names, or ‘None’ if there are no fields. |
ndim |
Number of dimensions of the sub-array if this data type describes a sub-array, and ‘0’ otherwise. |
newbyteorder |
Return a new dtype with a different byte order. |
num |
A unique number for each of the 21 different built-in types. |
shape |
Shape tuple of the sub-array if this data type describes a sub-array, and ‘()’ otherwise. |
str |
The array-protocol typestring of this data-type object. |
subdtype |
Tuple ‘(item_dtype, shape)’ if this ‘dtype’ describes a sub-array, and None otherwise. |
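As a quick illustration of the dtype members listed above, the following minimal sketch inspects one structured dtype and one built-in dtype. The field names `student` and `score` are invented for this example and are not taken from the exercises.

```python
import numpy as np

# A hypothetical structured dtype with two named fields.
t = np.dtype([('student', np.int32), ('score', np.float64)])

print(t.names)      # field names, in order
print(t.itemsize)   # bytes per element: 4 + 8 (no alignment padding by default)
print(t.kind)       # 'V' (void) for structured types

# The same members on a simple built-in dtype.
f = np.dtype(np.float64)
print(f.name, f.itemsize, f.kind, f.char)
```

The `fields`, `names` and `subdtype` members are only populated for structured or sub-array dtypes; on a plain dtype such as `float64` they are `None`.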
numpy.linalg
Member | Description |
---|---|
cholesky |
Cholesky decomposition. |
cond |
Compute the condition number of a matrix. |
det |
Compute the determinant of an array. |
diagonal |
Returns specified diagonals of a matrix (or a stack of matrices) x. |
eig |
Compute eigenvalues & right eigenvectors of a square array. |
eigh |
Return the eigenvalues and eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix. |
eigvals |
Compute the eigenvalues of a general matrix. |
eigvalsh |
Compute the eigenvalues of a complex Hermitian or real symmetric matrix. |
inv |
Compute the (multiplicative) inverse of a matrix. |
lstsq |
Return the least-squares solution to a linear matrix equation. |
matrix_power |
Raise a square matrix to the (integer) power ‘n’. |
matrix_rank |
Return matrix rank of array using SVD method. |
matrix_norm |
Computes the matrix norm of a matrix (or a stack of matrices) x. |
matrix_transpose |
Transposes a matrix (or a stack of matrices) x. |
multi_dot |
Compute the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order. |
norm |
Matrix or vector norm. |
pinv |
Compute the (Moore-Penrose) pseudo-inverse of a matrix. |
qr |
Compute the qr factorization of a matrix. |
svdvals |
Returns the singular values of a matrix (or a stack of matrices) x. |
slogdet |
Compute the sign and (natural) logarithm of the determinant of an array. |
solve |
Solve a linear matrix equation, or system of linear scalar equations. |
svd |
Singular Value Decomposition. |
tensorinv |
Compute the ‘inverse’ of an N-dimensional array. |
tensorsolve |
Solve the tensor equation ‘a x = b’ for x. |
trace |
Returns the sum along the specified diagonals of a matrix (or a stack of matrices) x. |
vecdot |
Computes the vector dot product. |
vector_norm |
Computes the vector norm of a vector (or batch of vectors) x. |
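To show how a few of these members fit together, here is a small sketch that solves a made-up 2x2 linear system and checks the result with `det` and `inv`. The matrix `A` and vector `b` are assumptions chosen only so the answer is easy to verify by hand.

```python
import numpy as np

# Made-up system of equations:
#   3x +  y = 9
#    x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)      # solves A @ x = b
print(x)

det = np.linalg.det(A)         # determinant: 3*2 - 1*1
print(det)

identity = np.linalg.inv(A) @ A   # approximately the identity matrix
print(identity)
```

In practice `solve` is usually preferred over computing `inv(A) @ b`, as it avoids forming the inverse explicitly.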
numpy.fft
Member | Description |
---|---|
fft |
Compute the one-dimensional discrete Fourier Transform. |
fft2 |
Compute the 2-dimensional discrete Fourier Transform. |
fftfreq |
Return the Discrete Fourier Transform sample frequencies. |
fftn |
Compute the N-dimensional discrete Fourier Transform. |
fftshift |
Shift the zero-frequency component to the center of the spectrum. |
helper |
Discrete Fourier Transforms — helper.py |
hfft |
Compute the FFT of a signal that has Hermitian symmetry, i.e., a real spectrum. |
ifft |
Compute the one-dimensional inverse discrete Fourier Transform. |
ifft2 |
Compute the 2-dimensional inverse discrete Fourier Transform. |
ifftn |
Compute the N-dimensional inverse discrete Fourier Transform. |
ifftshift |
The inverse of ‘fftshift’. Although identical for even-length ‘x’, the functions differ by one sample for odd-length ‘x’. |
ihfft |
Compute the inverse FFT of a signal that has Hermitian symmetry. |
irfft |
Computes the inverse of ‘rfft’. |
irfft2 |
Computes the inverse of ‘rfft2’. |
irfftn |
Computes the inverse of ‘rfftn’. |
rfft |
Compute the one-dimensional discrete Fourier Transform for real input. |
rfft2 |
Compute the 2-dimensional FFT of a real array. |
rfftfreq |
Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft). |
rfftn |
Compute the N-dimensional discrete Fourier Transform for real input. |
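The following sketch shows how `rfft`, `rfftfreq` and `irfft` might be combined. The signal (a 5 Hz sine wave sampled at 100 Hz) is a hypothetical example, not something taken from the text.

```python
import numpy as np

# Hypothetical signal: a 5 Hz sine wave sampled at 100 Hz for one second.
sample_rate = 100
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 5 * t)

spectrum = np.fft.rfft(signal)                           # FFT for real input
freqs = np.fft.rfftfreq(signal.size, d=1 / sample_rate)  # matching frequencies

# The largest magnitude should sit at the 5 Hz bin.
peak_freq = freqs[np.argmax(np.abs(spectrum))]
print(peak_freq)

# irfft recovers the original signal, to floating-point precision.
recovered = np.fft.irfft(spectrum, n=signal.size)
print(np.allclose(recovered, signal))
```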
numpy.random
The numpy.random module is a NumPy sub-package, primarily used for generating random numbers and performing various statistical operations. The module provides a suite of functions that support many aspects of randomisation and probability distributions.
Member | Description |
---|---|
beta |
Draw samples from a Beta distribution. |
binomial |
Draw samples from a binomial distribution. |
bit_generator |
BitGenerator base class and SeedSequence used to seed the BitGenerators. |
bytes |
Return random bytes. |
chisquare |
Draw samples from a chi-square distribution. |
choice |
Generates a random sample from a given 1-D array. |
default_rng |
Construct a new Generator with the default BitGenerator (PCG64). |
dirichlet |
Draw samples from the Dirichlet distribution. |
exponential |
Draw samples from an exponential distribution. |
f |
Draw samples from an F distribution. |
gamma |
Draw samples from a Gamma distribution. |
geometric |
Draw samples from the geometric distribution. |
get_state |
Return a tuple representing the internal state of the generator. |
gumbel |
Draw samples from a Gumbel distribution. |
hypergeometric |
Draw samples from a Hypergeometric distribution. |
laplace |
Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay). |
logistic |
Draw samples from a logistic distribution. |
lognormal |
Draw samples from a log-normal distribution. |
logseries |
Draw samples from a logarithmic series distribution. |
multinomial |
Draw samples from a multinomial distribution. |
multivariate_normal |
Draw random samples from a multivariate normal distribution. |
negative_binomial |
Draw samples from a negative binomial distribution. |
noncentral_chisquare |
Draw samples from a noncentral chi-square distribution. |
noncentral_f |
Draw samples from the noncentral F distribution. |
normal |
Draw random samples from a normal (Gaussian) distribution. |
pareto |
Draw samples from a Pareto II or Lomax distribution with specified shape. |
permutation |
Randomly permute a sequence, or return a permuted range. |
poisson |
Draw samples from a Poisson distribution. |
power |
Draws samples in [0, 1] from a power distribution with positive exponent a - 1. |
rand |
Random values in a given shape. |
randint |
Return random integers from ‘low’ (inclusive) to ‘high’ (exclusive). |
randn |
Return a sample (or samples) from the “standard normal” distribution. |
random |
Return random floats in the half-open interval [0.0, 1.0). |
random_integers |
Random integers of type ‘np.int_’ between ‘low’ and ‘high’, inclusive. |
random_sample |
Return random floats in the half-open interval [0.0, 1.0). |
ranf |
This is an alias of ‘random_sample’. See ‘random_sample’ for the complete documentation. |
rayleigh |
Draw samples from a Rayleigh distribution. |
sample |
This is an alias of ‘random_sample’. See ‘random_sample’ for the complete documentation. |
seed |
Reseed the singleton RandomState instance. |
set_state |
Set the internal state of the generator from a tuple. |
shuffle |
Modify a sequence in-place by shuffling its contents. |
standard_cauchy |
Draw samples from a standard Cauchy distribution with mode = 0. |
standard_exponential |
Draw samples from the standard exponential distribution. |
standard_gamma |
Draw samples from a standard Gamma distribution. |
standard_normal |
Draw samples from a standard Normal distribution (mean=0, stdev=1). |
standard_t |
Draw samples from a standard Student’s t distribution with ‘df’ degrees of freedom. |
triangular |
Draw samples from the triangular distribution over the interval ‘[left, right]’. |
uniform |
Draw samples from a uniform distribution. |
vonmises |
Draw samples from a von Mises distribution. |
wald |
Draw samples from a Wald, or inverse Gaussian, distribution. |
weibull |
Draw samples from a Weibull distribution. |
zipf |
Draw samples from a Zipf distribution. |
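Most entries in the table above are the legacy module-level functions. The `default_rng` entry returns a `Generator` object, which offers equivalent methods. Below is a minimal sketch of that style; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded Generator; the seed value is arbitrary.
rng = np.random.default_rng(42)

floats = rng.random(3)                             # floats in [0.0, 1.0)
ints = rng.integers(0, 10, size=(2, 3))            # integers in [0, 10)
normals = rng.normal(loc=0.0, scale=1.0, size=4)   # Gaussian samples

# Creating a second Generator with the same seed reproduces the same stream.
rng2 = np.random.default_rng(42)
same_stream = np.array_equal(floats, rng2.random(3))
print(same_stream)
```

Passing the generator object around explicitly, rather than relying on the global state set by `seed`, tends to make results easier to reproduce.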
numpy.polynomial
A sub-package for efficiently dealing with polynomials.
Member | Description |
---|---|
Polynomial |
Power series |
Chebyshev |
Chebyshev series |
Legendre |
Legendre series |
Laguerre |
Laguerre series |
Hermite |
Hermite series |
HermiteE |
HermiteE series |
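As a brief sketch of how the `Polynomial` class might be used, the example below builds a quadratic, then evaluates, solves and differentiates it. The coefficients are made up for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Coefficients are given in order of increasing degree:
# p(x) = 2 - 3x + x^2 = (x - 1)(x - 2)
p = Polynomial([2, -3, 1])

x = np.array([0, 1, 2, 3])
values = p(x)            # evaluate at several points at once
print(values)

roots = p.roots()        # roots of the polynomial
print(roots)

dp = p.deriv()           # derivative: -3 + 2x
print(dp.coef)
```

Note that the coefficient order is the reverse of that used by the older `np.poly1d` and `np.roots` functions, which expect the highest degree first.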
numpy.strings
The numpy.strings module provides a set of universal functions operating on arrays of type numpy.str_ or numpy.bytes_.
String operations | Description |
---|---|
add |
Add arguments element-wise. |
center |
Return a copy of a with its elements centered in a string of length width. |
capitalize |
Return a copy of a with only the first character of each element capitalized. |
decode |
Calls bytes.decode element-wise. |
encode |
Calls str.encode element-wise. |
expandtabs |
Return a copy of each string element where all tab characters are replaced by one or more spaces. |
ljust |
Return an array with the elements of a left-justified in a string of length width. |
lower |
Return an array with the elements converted to lowercase. |
lstrip |
For each element in a, return a copy with the leading characters removed. |
mod |
Return (a % i), that is pre-Python 2.6 string formatting (interpolation), element-wise for a pair of array_likes of str or unicode. |
multiply |
Return (a * i), that is string multiple concatenation, element-wise. |
partition |
Partition each element in a around sep. |
replace |
For each element in a, return a copy of the string with occurrences of substring old replaced by new. |
rjust |
Return an array with the elements of a right-justified in a string of length width. |
rpartition |
Partition (split) each element around the right-most separator. |
rstrip |
For each element in a, return a copy with the trailing characters removed. |
strip |
For each element in a, return a copy with the leading and trailing characters removed. |
swapcase |
Return element-wise a copy of the string with uppercase characters converted to lowercase and vice versa. |
title |
Return element-wise title cased version of string or unicode. |
translate |
For each element in a, return a copy of the string where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table. |
upper |
Return an array with the elements converted to uppercase. |
zfill |
Return the numeric string left-filled with zeros. |
Comparisons | Description |
---|---|
equal |
Return (x1 == x2) element-wise. |
not_equal |
Return (x1 != x2) element-wise. |
greater_equal |
Return the truth value of (x1 >= x2) element-wise. |
less_equal |
Return the truth value of (x1 <= x2) element-wise. |
greater |
Return the truth value of (x1 > x2) element-wise. |
less |
Return the truth value of (x1 < x2) element-wise. |
String information | Description |
---|---|
count |
Returns an array with the number of non-overlapping occurrences of substring sub in the range [start, end). |
endswith |
Returns a boolean array which is True where the string element in a ends with suffix, otherwise False. |
find |
For each element, return the lowest index in the string where substring sub is found, such that sub is contained in the range [start, end). |
index |
Like find, but raises ValueError when the substring is not found. |
isalnum |
Returns true for each element if all characters in the string are alphanumeric and there is at least one character, false otherwise. |
isalpha |
Returns true for each element if all characters in the data interpreted as a string are alphabetic and there is at least one character, false otherwise. |
isdecimal |
For each element, return True if there are only decimal characters in the element. |
isdigit |
Returns true for each element if all characters in the string are digits and there is at least one character, false otherwise. |
islower |
Returns true for each element if all cased characters in the string are lowercase and there is at least one cased character, false otherwise. |
isnumeric |
For each element, return True if there are only numeric characters in the element. |
isspace |
Returns true for each element if there are only whitespace characters in the string and there is at least one character, false otherwise. |
istitle |
Returns true for each element if the element is a titlecased string and there is at least one character, false otherwise. |
isupper |
Returns true for each element if all cased characters in the string are uppercase and there is at least one character, false otherwise. |
rfind |
For each element, return the highest index in the string where substring sub is found, such that sub is contained in the range [start, end). |
rindex |
Like rfind, but raises ValueError when the substring sub is not found. |
startswith |
Returns a boolean array which is True where the string element in a starts with prefix, otherwise False. |
str_len |
Returns the length of each element. |
NumPy docs. numpy.org. numpy.org/doc/stable/reference/arrays.classes.html
Answers to exercises assume, where required, that numpy has been imported:
import numpy as np
Please note that some of the output and results have been formatted for better display.
Create a new 64-bit dtype of float type, using a sized alias, and assign it to the variable t. Print out the variable.
t = np.dtype('float64')
print(t)
Repeat the above dtype creation, but instead using an equivalent native Python type.
t = np.dtype(float)
print(t)
Write a Python expression that calculates the area of a circle with radius of 30mm.
# mm units
radius = 30
# Formula: a = pi * r * r
area_mm_sq = np.pi * (radius**2)
# In units of mm-squared
print(area_mm_sq)
Convert the result of the area of the circle to units of cm². Print the result.
# 1 cm = 10 mm, so 1 cm² = 100 mm²
area_cm_sq = area_mm_sq / (10**2)
print(area_cm_sq)
Create a 2-D Python list of the test scores, with the student scores as “rows” in test order. Assign this list to the variable student_scores_list. Print out this list.
student_scores_list = [
[1, 63.5, 56.0, 68.0, 73.5],
[2, 53.0, 77.5, 61.0, 83.0],
[3, 59.0, 79.0, 67.5, 70.0]
]
print(student_scores_list)
Using the list from Exercise 4-1, create a NumPy array assigned to student_scores_arr, and explicitly assign it an appropriate floating point dtype. Print out the array.
student_scores_arr = np.array(student_scores_list,
dtype=float)
print(student_scores_arr)
All elements of a NumPy array must share the same type, so in this case the whole numbers (the integer student IDs) were promoted to floating point values.
Create a copy of student_scores_arr whilst assigning it to a new variable. Change the type of the copied array to a suitable integer. What do the scores look like now? What effect did the conversion have on the values?
# astype returns a new array (a copy) with the requested dtype
scores_arr_cp = student_scores_arr.astype(np.int_)
print(scores_arr_cp)
The effect of the conversion was to discard the fractional part of each value; for these positive scores, that is the same as rounding down (taking the floor).
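Casting floats to integers with astype drops the fractional part, which matches rounding down only for non-negative values. The following sketch, using made-up values that include negatives, shows where the two differ:

```python
import numpy as np

# Made-up values, including negatives, to show the difference.
x = np.array([63.5, 56.9, -1.5, -2.9])

truncated = x.astype(np.int_)            # drops the fractional part
floored = np.floor(x).astype(np.int_)    # rounds towards negative infinity

print(truncated)
print(floored)
```

If rounding to the nearest integer is wanted instead, one option is to apply `np.rint` before the cast.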
Create a new array filled with ones, of the same dimensions as student_scores_arr. Print the array. What is the dtype of this array?
arr_ones = np.ones_like(student_scores_arr)
print(arr_ones)
# ones_like copies the dtype of its input as well as the shape
print(arr_ones.dtype)
Create an identity matrix of 4x4 size, and print the result.
arr_ident = np.identity(n=4, dtype=int)
print(arr_ident)
Design a suitable named compound type for student_scores_arr (using sensible names without spaces), where the student id is an integer, and the scores are floating point. Recreate the scores array to use this compound type. Print out the dtype for the array. (Hint: use tuples as rows.)
# One field for the id plus one field per test score
t = np.dtype([
('student', int),
('test1', float),
('test2', float),
('test3', float),
('test4', float)])
# Note the use of tuples as rows; each tuple supplies one value per field.
student_scores_arr = np.array([
(1, 85.5, 90.0, 78.5, 92.0),
(2, 79.0, 88.5, 95.5, 87.0),
(3, 92.5, 87.0, 89.5, 91.0)],
dtype=t)
print(student_scores_arr.dtype)
Convert the structured array created in Exercise 4-6 to a recarray. Print out the 2nd column of the record array by name.
scores_rec = student_scores_arr.view(np.recarray)
scores_rec.test1
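For comparison, the sketch below accesses the same field on a structured array and on its recarray view. It uses a cut-down, hypothetical two-field layout rather than the full type from the exercise.

```python
import numpy as np

# Hypothetical two-field layout; the exercise uses more score columns.
t = np.dtype([('student', int), ('test1', float)])
scores = np.array([(1, 85.5), (2, 79.0)], dtype=t)

by_key = scores['test1']          # structured array: index by field name
print(by_key)

rec = scores.view(np.recarray)
by_attr = rec.test1               # recarray: attribute access
print(by_attr)

# view() does not copy: both objects refer to the same data.
print(np.shares_memory(scores, rec))
```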
What is the shape of the NumPy array? Use the array’s shape property to confirm your conclusion.
This is a 2x3 array. This can be confirmed using:
a.shape
What is the size of each element in this array, in bytes?
i = a.itemsize
print(i)
Use the appropriate inspection property to find the total number of bytes consumed by the array. How does this compare with the previous exercise’s result multiplied by the total number of elements?
b = a.nbytes
print(b)
size gives you the number of elements, so multiplying it by the item size gives the same total:
b = i * a.size
print(b)
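The relationship between these three properties can be checked directly. The sketch below uses a stand-in 2x3 array with an explicit 64-bit integer dtype; the values and dtype are assumptions, as the exercise's own array is defined elsewhere.

```python
import numpy as np

# Stand-in for the 2x3 array from the exercise (values and dtype assumed).
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(a.itemsize)   # bytes per element
print(a.size)       # number of elements
print(a.nbytes)     # total bytes consumed by the elements

print(a.nbytes == a.itemsize * a.size)
```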
What is the string name of the data type of this array?
a.dtype.name
If a third row containing the elements [7 8 9]
was added to the array, what would be the number of dimensions?
Adding a new row simply increases the size of the 0th dimension, so the number of dimensions remains unchanged at 2. The shape, however, becomes (3, 3).
a = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
a.ndim
a.shape
Load the file into a NumPy array, excluding the header, and print the array.
# Set f to the file's path on your computer:
f = '/home/student_scores.csv'
a = np.loadtxt(f, delimiter=',', skiprows=1)
print(a)
Load the array instead using the compound type. Print the new array.
from numpy import genfromtxt
r = genfromtxt(f, delimiter=',', dtype=t, names=True)
print(r)
Convert the array you created in Exercise 6-2 to a recarray, and print out a list of the student numbers using the ‘array.field’ notation.
r = r.view(np.recarray)
r.student_no
Get a list of (only) the scores of the 1st student.
scores_array[0, 1:]
Print out student IDs of the 1st two students.
scores_array[:2, 0]
Modify the score of the 2nd student in the second test to 87.
scores_array[1, 2] = 87
print(scores_array[1,:])
Print the scores of all students in the 3rd test.
scores_array[:, 3]
Modify the scores of all students in the 4th test to 75.
scores_array[:, 4] = 75
print(scores_array)
Print the scores of the 2nd student in the last two tests.
scores_array[1, -2:]
Modify the scores of all students in the 1st test to 70.
scores_array[:, 1] = 70
print(scores_array)
Print the ID and scores of the last student.
scores_array[-1,0]
scores_array[-1, 1:]
Repeat the selection from Exercise 7-8 but assign it to the variable sub_arr. Change the first score of sub_arr to 77. Is this sub-selection a view? Confirm by printing both sub_arr and scores_array to see if the original array has also been modified.
sub_arr = scores_array[-1, 1:]
sub_arr[0] = 77
print(sub_arr)
print(scores_array)
sub_arr is a view: the printed output shows the new value of 77 in both sub_arr and scores_array.
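Besides printing both arrays, a view can be confirmed programmatically. The sketch below uses `np.shares_memory` and the `base` attribute on a small made-up scores array.

```python
import numpy as np

# Made-up scores array: student ID followed by four scores.
scores_array = np.array([[1, 63.5, 56.0, 68.0, 73.5],
                         [2, 53.0, 77.5, 61.0, 83.0]])

view = scores_array[-1, 1:]           # basic slicing returns a view
copy = scores_array[-1, 1:].copy()    # explicit, independent copy

print(np.shares_memory(scores_array, view))   # view shares the data
print(np.shares_memory(scores_array, copy))   # copy does not
print(view.base is scores_array)              # a view records its base array

view[0] = 77                # writes through to scores_array
print(scores_array[-1, 1])
```

Calling `.copy()` is the usual way to avoid modifying the original by accident.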
Retrieve an array of the scores only (no student id) and apply a conditional expression to return True|False for any scores over 80.
scores_only = scores_array[:, 1:]
scores_over_80 = scores_only > 80
print(scores_over_80)
Perform element-wise addition (+) and multiplication (*) of the two arrays and print the results.
# Element-wise addition using ufunc
addition_result = np.add(a, b)
print("Addition Result:", addition_result)
# Element-wise multiplication using ufunc
multiplication_result = np.multiply(a, b)
print("Multiplication Result:", multiplication_result)
Does the matrix dot operation between the arrays result in a valid array? Print the result. Can you explain how the result was generated?
result = a.dot(b)
print(result)
No. A dot operation between two 1-D arrays of three elements each results in a scalar value: the sum of the products of the corresponding elements.
Using dot on two equal-size 1-D arrays (n-vectors) performs a vector dot-product (or scalar-product) operation. (This should not be confused with “scalar * array” or “array * array” arithmetic multiplication.) However, if one side of the dot operation in NumPy is a scalar value, then it will revert to “scalar * array” multiplication. Two 1-D arrays of different sizes cannot be combined with dot; NumPy raises a ValueError.
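The dot product of two equal-size 1-D arrays can be sketched as below. The values of `a` and `b` are hypothetical, since the exercise's own arrays are defined elsewhere.

```python
import numpy as np

# Hypothetical values; the exercise's own arrays may differ.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D with 1-D: the sum of the element-wise products, giving a scalar.
dot_result = np.dot(a, b)       # 1*4 + 2*5 + 3*6
manual_result = np.sum(a * b)   # the same calculation spelled out
print(dot_result, manual_result)

# Scalar with array: ordinary multiplication.
scaled = np.dot(2, a)
print(scaled)
```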
Perform element-wise comparison (greater than, less than, equal to) between the elements of these arrays. Print the result of each comparison.
greater_than_result = np.greater(a, b)
less_than_result = np.less(a, b)
equal_to_result = np.equal(a, b)
print("Greater Than Result:", greater_than_result)
print("Less Than Result:", less_than_result)
print("Equal To Result:", equal_to_result)
Perform scalar division on the array a with the value 5. Print the resulting array. What is the dtype of the result?
result = a / 5
print(result)
print(result.dtype)
Create a larger NumPy array b with dimensions (3, 3) containing random values. Then, add array a to b. Print the resulting array.
b = np.random.randint(0, 10, size=(3, 3))
result = a + b
print(b)
print(result)
Note that your final result will vary depending on the random values generated in b. The result is determined by adding a ([1, 2, 3]) to each row of b.
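To make the row-wise addition easier to follow, this sketch swaps the random values for a fixed (3, 3) array and compares the broadcast result with an explicitly tiled version. The fixed values are an assumption for illustration only.

```python
import numpy as np

a = np.array([1, 2, 3])           # shape (3,)
b = np.arange(9).reshape(3, 3)    # fixed values instead of random ones

result = a + b                    # a is broadcast across each row of b
print(result)

# Equivalent explicit form: repeat a three times to make a (3, 3) array first.
explicit = np.tile(a, (3, 1)) + b
print(np.array_equal(result, explicit))
```

Broadcasting produces the same values without actually building the tiled copy of `a` in memory.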
What percentage of student scores are over 80?
scores_only = scores_array[:, 1:]
scores_over_80 = scores_only > 80
num_scores_over_80 = np.sum(scores_over_80)
perc_over_80 = (num_scores_over_80 / scores_only.size) * 100
print(scores_over_80)
print("Percentage of student scores over 80:", perc_over_80, "%")
Compute the mean and standard deviation of the NumPy array arr = np.array([1, 2, 3, 4, 5])
.
mean_value = np.mean(arr)
std_deviation = np.std(arr)
print("Mean:", mean_value)
print("Standard Deviation:", std_deviation)
Compute and print out the median and quartiles (25th and 75th percentiles) of the following NumPy array arr = np.array([10, 20, 30, 40, 50]).
arr = np.array([10, 20, 30, 40, 50])
median_value = np.median(arr)
quartiles = np.percentile(arr, [25, 75])
print("Median:", median_value)
print("25th Percentile (Q1):", quartiles[0])
print("75th Percentile (Q3):", quartiles[1])
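The same quartiles can also be obtained with `np.quantile`, which takes fractions between 0 and 1 rather than percentages. A minimal sketch, reusing the array from the exercise:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

q_percent = np.percentile(arr, [25, 50, 75])       # q given in the range 0..100
q_fraction = np.quantile(arr, [0.25, 0.5, 0.75])   # q given in the range 0..1

print(q_percent)
print(np.allclose(q_percent, q_fraction))
```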
Compute and print out the correlation coefficient between the two NumPy arrays x and y.
corr_coefficient = np.corrcoef(x, y)[0, 1]
print("Correlation coefficient:", corr_coefficient)
The correlation coefficient is very near to -1, therefore there is a strong negative correlation; as x increases, y decreases linearly.
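The `[0, 1]` index is needed because `np.corrcoef` returns a full correlation matrix rather than a single number. The sketch below illustrates this with made-up data; the exercise's own x and y are defined elsewhere.

```python
import numpy as np

# Made-up data with a near-perfect negative linear relationship.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.1, 7.9, 6.0, 4.1, 1.9])

matrix = np.corrcoef(x, y)   # 2x2 correlation matrix
print(matrix.shape)

# The diagonal holds each variable's correlation with itself (1.0);
# the off-diagonal entries hold the correlation between x and y.
print(matrix[0, 1])
print(matrix[1, 0])
```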
Convert the array a = np.array([[1, 2], [3, 4]]) into its transpose.
a = np.array([[1, 2], [3, 4]])
a.transpose()
Is it possible to reshape the 2x2 array a = np.array([[1, 2], [3, 4]]) into a 4x1 array? If so what is the procedure? Assign this new array to variable b.
a = np.array([[1, 2], [3, 4]])
b = np.reshape(a, (4,1))
print(b)
Flatten the arrays a and b from exercise 2, and combine them into a single 2x4 array assigned to c.
a = np.ravel(a)
b = np.ravel(b)
c = np.vstack((a, b))
Use a rotate operation upon c to convert it into a 4x2 array.
# Using `c` (a 2x4 array) from the previous exercise..
np.rot90(c, 1)
Convert the 1-D array a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]) into a 3x3 array sorted in reverse order.
The key is to reverse the array first:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
a = np.flip(a)
# np.split returns a list of three 1-D arrays; stack them into a 3x3 array
np.vstack(np.split(a, 3))
Swap the case of the array ['Hello', 'WORLD', 'FROM', 'Python'] so that uppercase letters are converted to lowercase, and lower to upper.
a = np.array(['Hello', 'WORLD', 'FROM', 'Python'])
np.strings.swapcase(a)
Pad the array of numeric strings ['42', '97', '2005', '025'] with zeros, up to a width of 4.
a = np.array(['42', '97', '2005', '025'])
np.strings.zfill(a, 4)
Test whether the values in the array ['Los', 'Angeles', 'Year 2019'] consist only of alphabetic characters.
a = np.array(['Los', 'Angeles', 'Year 2019'])
np.strings.isalpha(a)
Find the length of each value in the previous array.
np.strings.str_len(a)