27 Jupyter Notebook Tips, Tricks, and Shortcuts

Date: 2023-09-11 14:16:06

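Jupyter Notebook, also commonly known as the IPython Notebook, is a flexible tool that brings code, images, comments, formulas, and plots together in one place, which makes for readable analyses.

Jupyter is quite extensible, supports many programming languages, and is easily hosted on your own computer or on almost any server; all you need is SSH or HTTP access. Best of all, it is completely free.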

[Figure: the Jupyter interface]

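Jupyter Notebook uses a Python kernel by default, which is why it was originally called the IPython Notebook. It is a product of Project Jupyter, whose name is an acronym for the three languages it was designed to serve (JUlia, PYThon, and R) and also sounds like the planet Jupiter. This article covers 27 tips and tricks for working with Jupyter more comfortably.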

◆ ◆ ◆

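1. Keyboard shortcuts

As any power user knows, keyboard shortcuts save a lot of time. Jupyter keeps a list of them in the top menu under Help > Keyboard Shortcuts. It is worth checking this list every time you update Jupyter, because new shortcuts are added all the time. Another way to reach commands is the command palette: Cmd + Shift + P (or Ctrl + Shift + P on Linux and Windows). This dialog box lets you run any command by name, which is especially useful when you do not know the shortcut for an action or when the action has no shortcut. It works much like Spotlight search on a Mac, and once you start using it you will wonder how you lived without it.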

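A few of my favorites:

If you want the same behavior in every instance of Jupyter (Notebook and Console), create the file ~/.ipython/profile_default/ipython_config.py with a few simple lines:

The Help menu has handy links to the online documentation for common libraries, including NumPy, Pandas, SciPy, and Matplotlib.

You can also prefix a library, method, or variable with ? to open a quick reference on its syntax.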

In [3]: ?str.replace()

Return a copy of S with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
Type: method_descriptor

◆ ◆ ◆

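4. Plotting in notebooks

There are several options for plotting in a notebook:

- matplotlib (http://matplotlib.org/), the de facto standard, activated with %matplotlib inline (tutorial: https://www.dataquest.io/blog/matplotlib-tutorial/).
- %matplotlib notebook provides interactivity but can be a little slow, because rendering is done on the server side.
- mpld3 (https://github.com/mpld3/mpld3) provides an alternative renderer (using d3) for matplotlib code. It is quite nice, though incomplete.
- bokeh (http://bokeh.pydata.org/en/latest/) is a better option for building interactive plots.
- plot.ly (https://plot.ly/) can generate very nice plots. It started out as a paid service, but the Python library itself is now open source.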

◆ ◆ ◆

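5. Jupyter magic commands

The %matplotlib inline command mentioned above is one of the Jupyter magic commands.

I recommend reading the documentation on Jupyter magic commands (http://ipython.readthedocs.io/en/stable/interactive/magics.html); you will certainly find it useful. Here are a few of my favorites: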

◆ ◆ ◆

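6. Jupyter Magic - %env: set environment variables

You can manage the environment variables of your notebook without restarting the Jupyter server process. Some libraries (theano, for example) use environment variables to control their behavior, and %env is the most convenient way to set them.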

In [55]:
    # Running %env without any arguments
    # lists all environment variables

    # The line below sets the environment
    # variable OMP_NUM_THREADS
    %env OMP_NUM_THREADS=4
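For code that also has to run outside a notebook, the same effect can be sketched in plain Python. This is only an illustration, assuming that `%env NAME=value` amounts to an assignment into the kernel process's `os.environ`; the variable name is the one used in the cell above.

```python
import os

# Rough equivalent of `%env OMP_NUM_THREADS=4`: set the variable for this
# process and for any child process started afterwards.
os.environ["OMP_NUM_THREADS"] = "4"

# Rough equivalent of `%env OMP_NUM_THREADS`: read a single variable back.
threads = os.environ.get("OMP_NUM_THREADS")
print(f"OMP_NUM_THREADS={threads}")

# Rough equivalent of a bare `%env`: a snapshot of all environment variables.
all_vars = dict(os.environ)
```

Either way, a library that reads the variable at import time has to be imported after the variable is set.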

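◆ ◆ ◆

7. Jupyter Magic - %run: execute Python code

It is well known that %run can execute Python code from .py files. Less well known is that it can also execute other Jupyter notebooks, which is quite useful.

Note that using %run is not the same as importing a Python module.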

In [56]:
    # This will execute and show the output from
    # all code cells of the specified notebook
    %run ./two-histograms.ipynb
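To get a feel for what running a notebook involves: an .ipynb file is JSON, and its code cells can be read and executed in order. The sketch below is a simplified illustration, not how IPython implements %run. It assumes the nbformat 4 layout (`cells`, `cell_type`, `source`), the helper name `run_notebook_cells` is made up, and magic or shell lines are simply skipped.

```python
import json
import tempfile
from pathlib import Path


def run_notebook_cells(path):
    """Execute the code cells of a notebook in order (hypothetical helper).

    Simplification: lines starting with % or ! (magics, shell escapes)
    are skipped, whereas IPython would run them.
    """
    notebook = json.loads(Path(path).read_text(encoding="utf-8"))
    namespace = {}
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows `source` to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        lines = [line for line in source.splitlines()
                 if not line.lstrip().startswith(("%", "!"))]
        exec("\n".join(lines), namespace)
    return namespace


# A tiny demo notebook, written to a temporary file.
demo = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["x = 2\n", "y = x * 21\n"]},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    nb_path = Path(tmp) / "demo.ipynb"
    nb_path.write_text(json.dumps(demo), encoding="utf-8")
    result = run_notebook_cells(nb_path)

print(result["y"])
```

The variables defined by the cells end up in the returned namespace, which mirrors the way %run leaves the executed notebook's variables available afterwards.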

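◆ ◆ ◆

8. Jupyter Magic - %load: insert code from an external script

This replaces the contents of the current cell with an external script. The source can be a file on your computer or a URL.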

In [ ]:
    # Before running
    %load ./hello_world.py

In [61]:
    # After running
    # %load ./hello_world.py
    if __name__ == "__main__":
        print("Hello World!")

◆ ◆ ◆

9. Jupyter Magic - %store: pass variables between notebooks

In [62]:
    data = 'this is the string I want to pass to different notebook'
    %store data
    del data  # This has deleted the variable
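Conceptually, %store keeps a serialized copy of the variable on disk so that another session can bring it back (there, with `%store -r data`). The sketch below imitates that idea with `pickle` and a temporary file. It is an analogy under stated assumptions, not IPython's actual storage layout; the directory and the file name `data.pkl` are made up.

```python
import pickle
import tempfile
from pathlib import Path

store_dir = Path(tempfile.mkdtemp())   # stand-in for a location both notebooks can reach
store_file = store_dir / "data.pkl"    # hypothetical file name

# "First notebook": save the variable, then delete it.
data = "this is the string I want to pass to different notebook"
store_file.write_bytes(pickle.dumps(data))
del data

# "Second notebook": restore the variable from disk.
data = pickle.loads(store_file.read_bytes())
print(data)
```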

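◆ ◆ ◆

10. Jupyter Magic - %who: list all global variables

Without any arguments, %who lists all global variables. Passing a parameter such as str lists only the variables of that type.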

In [1]:
    one = "for the money"
    two = "for the show"
    three = "to get ready now go cat go"
    %who str
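The filtering that `%who str` performs can be approximated in plain Python by scanning a namespace for values of a given type. The helper name `who` is made up for this sketch, and a small explicit dictionary is used here; in a notebook you would pass `globals()` instead.

```python
def who(namespace, kind=None):
    """Return the sorted names of variables, optionally filtered by type (hypothetical helper)."""
    return sorted(
        name for name, value in namespace.items()
        if not name.startswith("_")
        and (kind is None or type(value) is kind)
    )


one = "for the money"
two = "for the show"
three = "to get ready now go cat go"
count = 3

variables = {"one": one, "two": two, "three": three, "count": count}
string_names = who(variables, str)
print(string_names)
```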

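◆ ◆ ◆

11. Jupyter Magic - timing

There are two Jupyter magic commands for timing: %%time and %timeit. They are especially handy when some code is slow and you want to find out where the problem lies.

The two behave differently, so the distinction is worth noting.

%%time reports the time taken by a single run of the code in your cell.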

In [4]:
    %%time
    import time
    for _ in range(1000):
        time.sleep(0.01)  # sleep for 0.01 seconds

CPU times: user 21.5 ms, sys: 14.8 ms, total: 36.3 ms
Wall time: 11.6 s

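%timeit uses Python's timeit module: it runs the statement many times in a loop and repeats that loop several times. The loop count is chosen automatically (100,000 in the run below), and the IPython version shown here reports the best of 3 repeats; newer versions report the mean and standard deviation over 7 runs instead.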

In [3]:
    import numpy
    %timeit numpy.random.normal(size=100)

The slowest run took 7.29 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.5 µs per loop
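The same kind of measurement can be made with the standard-library timeit module directly, which helps when the code has to run outside a notebook. In this minimal sketch the statement being timed and the loop counts are arbitrary choices for illustration, and only the standard library is used rather than numpy.

```python
import timeit

# Each entry in `timings` is the total time for `number` executions.
number = 1000
timings = timeit.repeat(
    stmt="sorted(values)",
    setup="import random; values = [random.random() for _ in range(100)]",
    repeat=3,
    number=number,
)

# Report the fastest repeat, in the style of the "best of 3" output above.
best_per_loop = min(timings) / number
print(f"{number} loops, best of {len(timings)}: {best_per_loop * 1e6:.2f} µs per loop")
```

Taking the minimum rather than the mean is the usual choice here, because slower repeats mostly reflect interference from other processes rather than the code itself.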

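◆ ◆ ◆

12. Jupyter Magic - %%writefile and %pycat: export the contents of a cell / show the contents of an external script

The %%writefile magic saves the contents of a cell to an external file. %pycat does the opposite: it shows you (in a popup) the syntax-highlighted contents of an external file.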

In [7]:
    %%writefile pythoncode.py

    import numpy

    def append_if_not_exists(arr, x):
        if x not in arr:
            arr.append(x)

    def some_useless_slow_function():
        arr = list()
        for i in range(10000):
            x = numpy.random.randint(0, 10000)
            append_if_not_exists(arr, x)
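Outside a notebook, the round trip of writing a snippet to a file and reading it back can be sketched with pathlib. This is only a rough stand-in for the two magics: there is no syntax highlighting, and the file is written to a temporary directory chosen for this example.

```python
import tempfile
from pathlib import Path

source = '''def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)
'''

script = Path(tempfile.mkdtemp()) / "pythoncode.py"

# Like %%writefile: dump the text to a file.
script.write_text(source, encoding="utf-8")

# Like %pycat, minus the highlighting: show the file's contents.
contents = script.read_text(encoding="utf-8")
print(contents)
```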

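◆ ◆ ◆

13. Jupyter Magic - %prun: show how much time your program spent in each function

Running %prun followed by a statement (such as a function call) gives you an ordered table showing how many times each internal function was called, how long each call took, and the cumulative time spent in the function.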

In [47]: %prun some_useless_slow_function()

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.527    0.000    0.528    0.000 :2(append_if_not_exists)
    10000    0.022    0.000    0.022    0.000 {method randint of mtrand.RandomState objects}
        1    0.006    0.006    0.556    0.556 :6(some_useless_slow_function)
     6320    0.001    0.000    0.001    0.000 {method append of list objects}
        1    0.000    0.000    0.556    0.556 :1()
        1    0.000    0.000    0.556    0.556 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method disable of _lsprof.Profiler objects}
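A similar table can be produced outside a notebook with the standard-library cProfile and pstats modules. The sketch below re-creates the example function with the standard library's `random` in place of numpy and a smaller loop count, both of which are changes made only to keep the example self-contained and quick.

```python
import cProfile
import io
import pstats
import random


def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)


def some_useless_slow_function(n=2000):
    arr = []
    for _ in range(n):
        append_if_not_exists(arr, random.randint(0, n))
    return arr


profiler = cProfile.Profile()
profiler.enable()
some_useless_slow_function()
profiler.disable()

# Sort by internal time (the tottime column) and keep the top 5 rows.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

As in the table above, the membership test inside `append_if_not_exists` should dominate, because `x not in arr` scans the whole list on every call.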

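◆ ◆ ◆

14. Jupyter Magic - debugging with %pdb

Jupyter has its own interface to the Python Debugger (pdb) (https://docs.python.org/3.5/library/pdb.html), which makes it possible to step inside a function and investigate what goes wrong there.

The commands available in pdb are listed at (https://docs.python.org/3.5/library/pdb.html#debugger-commands).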

In [ ]:
    %pdb

    def pick_and_take():
        picked = numpy.random.randint(0, 1000)
        raise NotImplementedError()

    pick_and_take()

Automatic pdb calling has been turned ON
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
 in ()
      5     raise NotImplementedError()
      6
----> 7 pick_and_take()

 in pick_and_take()
      3 def pick_and_take():
      4     picked = numpy.random.randint(0, 1000)
----> 5     raise NotImplementedError()
      6
      7 pick_and_take()

NotImplementedError:

> (5)pick_and_take()
      3 def pick_and_take():
      4     picked = numpy.random.randint(0, 1000)
----> 5     raise NotImplementedError()
      6
      7 pick_and_take()

ipdb>

◆ ◆ ◆

15. Suppress the output of a final function

Sometimes it is handy to suppress the output of the function on the final line, for example when plotting. To do this, just add a semicolon at the end.

In [4]:
    %matplotlib inline
    from matplotlib import pyplot as plt
    import numpy
    x = numpy.linspace(0, 1, 1000)**1.5

In [5]:
    # Here you get the output of the function
    plt.hist(x)

Out[5]:
    (array([ 216., 126., 106., 95., 87., 81., 77., 73., 71., 68.]),
     array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ]),
     )

In [6]:
    # By adding a semicolon at the end, the output is suppressed.
    plt.hist(x);

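◆ ◆ ◆

16. Executing shell commands

It is easy to run a shell command from inside your notebook, for example to check which datasets are available in your working folder.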

In [7]: !ls *.csv

nba_2016.csv titanic.csv pixar_movies.csv whitehouse_employees.csv
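When the notebook also has to work on a machine without `ls` (Windows, for instance), the same listing can be produced in plain Python. This is a minimal sketch in which the folder and its files are created on the fly so that the example is self-contained; notes.txt is a made-up extra file that shows the filter at work.

```python
import tempfile
from pathlib import Path

# Demo folder with a few empty files.
workdir = Path(tempfile.mkdtemp())
for name in ("nba_2016.csv", "titanic.csv", "notes.txt"):
    (workdir / name).touch()

# Rough equivalent of `!ls *.csv` run inside that folder.
csv_files = sorted(p.name for p in workdir.glob("*.csv"))
print(csv_files)
```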

◆ ◆ ◆

17. Using LaTeX for formulas

When you write LaTeX in a Markdown cell, it is rendered as a formula using MathJax. For example:

$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$

becomes a typeset formula (Bayes' theorem) when the cell is rendered.

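◆ ◆ ◆

18. Run code from a different kernel in a notebook

If you want to, you can combine code from multiple kernels in one notebook.

Just use a Jupyter magic with the name of the kernel at the start of each cell that should use it, for example %%bash, %%HTML, %%python2, %%python3, %%ruby, or %%perl.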

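◆ ◆ ◆

19. Install other kernels for Jupyter

One of the nice features of Jupyter is its ability to run kernels for different languages. As an example, here is how to get an R kernel running.

Easy option: installing the R kernel using Anaconda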

conda install -c r r-essentials

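Less easy option: installing the R kernel manually

If you are not using Anaconda, the process is a little more involved. First, you need to install R from CRAN.

Once that is done, start an R console and run the following:

install.packages(c('repr', 'IRdisplay', 'crayon', 'pbdZMQ', 'devtools'))
devtools::install_github('IRkernel/IRkernel')
IRkernel::installspec()  # to register the kernel in the current R installation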

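◆ ◆ ◆

20. Running R and Python in the same notebook

The best way to do this is to install rpy2 (which requires a working installation of R), and that is easy with pip: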

pip install rpy2

You can then use the two languages together, and even pass variables between them:

In [1]:
    %load_ext rpy2.ipython

In [2]:
    %R require(ggplot2)

Out[2]:
    array([1], dtype=int32)

In [3]:
    import pandas as pd
    df = pd.DataFrame({
        'Letter': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
        'X': [4, 3, 5, 2, 1, 7, 7, 5, 9],
        'Y': [0, 4, 3, 6, 7, 10, 11, 9, 13],
        'Z': [1, 2, 3, 1, 2, 3, 1, 2, 3]
    })

In [4]:
    %%R -i df
    ggplot(data = df) + geom_point(aes(x = X, y = Y, color = Letter, size = Z))

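◆ ◆ ◆

21. Writing functions in other languages

Sometimes numpy is not fast enough and I need to write some faster code.

In principle, you can compile a function into a dynamic library and write Python wrappers for it...

But it is much nicer when that tedious part is done for you.

You can write functions in Cython or Fortran and call them directly from Python code.

First you need to install: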

!pip install cython fortran-magic

Personally I prefer Fortran, which I find very convenient for writing number-crunching functions. More details are at (http://arogozhnikov.github.io/2015/09/08/SpeedBenchmarks.html).

In [ ]:
    %load_ext fortranmagic

In [ ]:
    %%fortran
    subroutine compute_fortran(x, y, z)
        real, intent(in) :: x(:), y(:)
        real, intent(out) :: z(size(x, 1))

        z = sin(x + y)

    end subroutine compute_fortran

In [ ]:
    compute_fortran([1, 2, 3], [4, 5, 6])

There are also various JIT (just-in-time compilation) systems that can speed up your Python code. More examples are at (http://arogozhnikov.github.io/2015/09/08/SpeedBenchmarks.html).

◆ ◆ ◆

22. Multicursor support

Jupyter supports editing with multiple cursors at once, similar to Sublime Text. Simply hold down Alt and drag the mouse.

◆ ◆ ◆

23. Jupyter-contrib extensions

Jupyter-contrib extensions (https://github.com/ipython-contrib/jupyter_contrib_nbextensions) is a family of extensions that give Jupyter a lot more functionality, including, for example, jupyter spell-checker and code-formatter.

The following commands install the extensions along with a menu-based configurator that lets you browse and enable them from the main Jupyter screen.

!pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
!pip install jupyter_nbextensions_configurator
!jupyter contrib nbextension install --user
!jupyter nbextensions_configurator enable --user

◆ ◆ ◆

24. Create a presentation from a Jupyter notebook

Damian Avila's RISE (https://github.com/damianavila/RISE) lets you create a PowerPoint-style presentation from an existing notebook.
You can install RISE with conda:

conda install -c damianavila82 rise

Or alternatively with pip:

pip install RISE

Then run the following commands to install and enable the extension:

jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix

◆ ◆ ◆

25. The Jupyter output system

Notebooks are displayed as HTML, and cell output can be HTML as well, so you can return virtually anything: video, audio, images.

This example scans my image folder and shows thumbnails of the first five images.

In [12]:
    import os
    from IPython.display import display, Image
    names = [f for f in os.listdir('../images/ml_demonstrations/') if f.endswith('.png')]
    for name in names[:5]:
        display(Image('../images/ml_demonstrations/' + name, width=100))

We can create the same list with a bash command, because magics and bash calls return Python variables:

In [10]:
    names = !ls ../images/ml_demonstrations/*.png
    names[:5]

Out[10]:
    ['../images/ml_demonstrations/colah_embeddings.png',
     '../images/ml_demonstrations/convnetjs.png',
     '../images/ml_demonstrations/decision_tree.png',
     '../images/ml_demonstrations/decision_tree_in_course.png',
     '../images/ml_demonstrations/dream_mnist.png']
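In a plain Python script, capturing a command's output as a list of lines can be sketched with subprocess. The command used here is a stand-in: it calls the Python interpreter itself to print two made-up file names, so the example does not depend on `ls` or on any particular folder existing.

```python
import subprocess
import sys

# Stand-in command that prints one file name per line.
command = [sys.executable, "-c", "print('a.png'); print('b.png')"]
completed = subprocess.run(command, capture_output=True, text=True, check=True)

# Rough equivalent of `names = !<command>`: one list entry per output line.
names = completed.stdout.splitlines()
print(names[:5])
```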

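This article comes from Big Data Digest (大数据文摘), a partner of the Yunqi Community. For related content, follow the BigDataDigest WeChat official account.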
