[Translation] Python Hiring Guide

Original article: Toptal

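##Python Expert or Snake in the Grass?

So you've found what appears to be a strong Python developer. But how do you determine whether he or she really belongs to the top 1% of candidates you are looking to hire? There is no magic or foolproof technique, but there are certainly questions you can ask that help gauge the depth of a candidate's knowledge of the language. A brief sampling of such questions is provided below.

It is important to bear in mind that these sample questions are intended merely as a guide. Not every “A” candidate worth hiring will be able to answer them all correctly, nor does answering them all correctly guarantee an “A” candidate. Hiring is as much an art as it is a science.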

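##Python in the Weeds…

While it is true that the best developers don't waste time memorizing what can easily be found in a language reference or API document, there are certain key features and capabilities of any programming language that every expert can, and should, be expected to be well versed in. Here are some Python-specific examples.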

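###Q: Why use function decorators? Give an example.

A decorator is essentially a callable Python object that is used to modify or extend a function or class definition. One of the beauties of decorators is that a single decorator definition can be applied to multiple functions (or classes). Much can thereby be accomplished with decorators that would otherwise require lots of boilerplate (or, even worse, redundant!) code. Flask, for example, uses decorators as the mechanism for adding new endpoints to a web application. Some of the more common uses of decorators include adding synchronization, type enforcement, logging, or pre/post conditions to a class or function.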
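As a minimal sketch of the logging use case mentioned above, the decorator below records every call made to the functions it wraps. The names `logged`, `call_log`, `add`, and `greet` are hypothetical and exist only for this illustration.

```python
import functools

call_log = []  # hypothetical in-memory log, used only for this illustration


def logged(func):
    """Decorator that records the name, arguments, and result of each call."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        call_log.append((func.__name__, args, kwargs, result))
        return result
    return wrapper


@logged
def add(a, b):
    return a + b


@logged
def greet(name):
    return "Hello, " + name


add(2, 3)
greet("Ada")
print(call_log)
```

One decorator definition serves both functions, so the logging logic is written once rather than repeated in each function body.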

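###Q: What are lambda expressions, list comprehensions and generator expressions? What are the advantages and appropriate uses of each?

A lambda expression is a shorthand technique for creating single-line, anonymous functions. Its simple, inline nature often, though not always, leads to more readable and concise code than a formal function definition. On the other hand, that same inline nature greatly limits what it can do. Being anonymous and inline, the only way to use the same lambda function in multiple locations in your code is to specify it redundantly.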

List comprehensions provide a concise syntax for creating lists. List comprehensions are commonly used to make lists where each element is the result of some operation(s) applied to each member of another sequence or iterable. They can also be used to create a subsequence of those elements whose members satisfy a certain condition. In Python, list comprehensions provide an alternative to using the built-in map() and filter() functions.

As the applied usage of lambda expressions and list comprehensions can overlap, opinions vary widely as to when and where to use one vs. the other. One point to bear in mind, though, is that a list comprehension executes somewhat faster than a comparable solution using map and lambda (some quick tests yielded a performance difference of roughly 10%). This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without doing so.
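The overlap between the two styles can be seen in a short sketch. The data and variable names here are made up for illustration; both approaches build the same list.

```python
numbers = [1, 2, 3, 4, 5, 6]

# A lambda is handy as a short, throwaway function argument.
by_last_digit = sorted([21, 13, 40, 7], key=lambda n: n % 10)

# List comprehension: transform and filter in one expression.
even_squares = [n * n for n in numbers if n % 2 == 0]

# The same result using map(), filter(), and two lambdas.
even_squares_functional = list(
    map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
)

print(by_last_digit)            # [40, 21, 13, 7]
print(even_squares)             # [4, 16, 36]
print(even_squares_functional)  # [4, 16, 36]
```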

Generator expressions are syntactically and functionally similar to list comprehensions but there are some fairly significant differences between the ways the two operate and, accordingly, when each should be used. In a nutshell, iterating over a generator expression or list comprehension will essentially do the same thing, but the list comprehension will create the entire list in memory first while the generator expression will create the items on the fly as needed. Generator expressions can therefore be used for very large (and even infinite) sequences and their lazy (i.e., on demand) generation of values results in improved performance and lower memory usage. It is worth noting, though, that the standard Python list methods can be used on the result of a list comprehension, but not directly on that of a generator expression.
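The eager versus lazy distinction can be sketched as follows, using `itertools.islice` and `itertools.count` from the standard library; the variable names are illustrative only.

```python
import itertools

squares_list = [n * n for n in range(5)]  # the whole list is built immediately
squares_gen = (n * n for n in range(5))   # nothing is computed yet

squares_list.append(25)                      # list methods work on the list
has_append = hasattr(squares_gen, "append")  # a generator has no list methods

first_two = list(itertools.islice(squares_gen, 2))  # consumes 0 and 1
remaining = list(squares_gen)                       # resumes where it left off

# A generator expression can describe an unbounded sequence.
evens = (n for n in itertools.count() if n % 2 == 0)
first_three_evens = list(itertools.islice(evens, 3))

print(has_append)         # False
print(first_two)          # [0, 1]
print(remaining)          # [4, 9, 16]
print(first_three_evens)  # [0, 2, 4]
```

Note that a generator can only be consumed once: after `remaining` is built, `squares_gen` is exhausted.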

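###Q: Consider the two approaches below for initializing a list, and the lists that result. How will the resulting lists differ, and why should you use one initialization method rather than the other?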

>>> # INITIALIZING AN ARRAY -- METHOD 1
...
>>> x = [[1,2,3,4]] * 3
>>> x
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>>
>>> # INITIALIZING AN ARRAY -- METHOD 2
...
>>> y = [[1,2,3,4] for _ in range(3)]
>>> y
[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
>>>
>>> # WHICH METHOD SHOULD YOU USE AND WHY?

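Although both methods appear at first glance to produce the same result, there is an extremely significant difference between the two. Method 2 produces, as you would expect, a list of 3 elements, each of which is itself an independent 4-element list. In method 1, however, all three elements of the outer list refer to the same 4-element list object. This can lead to behavior that is both unexpected and undesired, as shown below.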

>>> # MODIFYING THE x ARRAY FROM THE PRIOR CODE SNIPPET:
>>> x[0][3] = 99
>>> x
[[1, 2, 3, 99], [1, 2, 3, 99], [1, 2, 3, 99]]
>>> # UH-OH, DON’T THINK YOU WANTED THAT TO HAPPEN!
...
>>>
>>> # MODIFYING THE y ARRAY FROM THE PRIOR CODE SNIPPET:
>>> y[0][3] = 99
>>> y
[[1, 2, 3, 99], [1, 2, 3, 4], [1, 2, 3, 4]]
>>> # THAT’S MORE LIKE WHAT YOU EXPECTED!
...
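The aliasing can also be confirmed directly with the `is` operator and `id()`. This is a small sketch using hypothetical names `shared` and `independent` rather than the `x` and `y` above.

```python
shared = [[0, 0]] * 3                     # three references to one inner list
independent = [[0, 0] for _ in range(3)]  # three separate inner lists

same_object = shared[0] is shared[1]
distinct_objects = independent[0] is not independent[1]

print(same_object)                          # True
print(distinct_objects)                     # True
print(len({id(row) for row in shared}))     # 1
print(len({id(row) for row in independent}))  # 3
```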

###Q: What will be printed out by the second no-argument append() call below?

>>> def append(list=[]):
...     # append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append(['a','b'])
['a', 'b', 2]
>>>
>>> append()  # calling with no arg uses default list value of []
[0]
>>>
>>> append()  # but what happens when we AGAIN call append with no arg?

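When the default value of a function argument is an expression, that expression is evaluated only once, when the function is defined, not every time the function is called. So once the list argument has been initialized to an empty list, every subsequent call to append that omits the list argument keeps using that same list object. This produces the following, unexpected, behavior: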

>>> append()  # first call with no arg uses default list value of []
[0]
>>> append()  # but then look what happens...
[0, 1]
>>> append()  # successive calls keep extending the same default list!
[0, 1, 2]
>>> append()  # and so on, and so on, and so on...
[0, 1, 2, 3]
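The shared default can be observed on the function object itself through its `__defaults__` attribute. The sketch below uses a hypothetical function named `append_len` so that it does not shadow the built-in `list`.

```python
def append_len(items=[]):
    # The default list is created once, when the def statement runs.
    items.append(len(items))
    return items


defaults_before = list(append_len.__defaults__[0])  # snapshot copy

first = append_len()
second = append_len()

same_list = first is second
defaults_after = append_len.__defaults__[0]

print(defaults_before)  # []
print(same_list)        # True
print(defaults_after)   # [0, 1]
```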

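###Q: How might one modify the implementation of the append function in the previous question to avoid the behavior described there?

The following alternative implementation of append is one way to avoid the undesirable behavior described in the answer to the previous question: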

>>> def append(list=None):
...     if list is None:
...         list = []
...     # append the length of a list to the list
...     list.append(len(list))
...     return list
...
>>> append()
[0]
>>> append()
[0]

###Q: How can you swap the values of two variables with a single line of Python code?

Consider this simple example:

>>> x = 'X'
>>> y = 'Y'

In many other languages, swapping the values of x and y requires that you do the following:

>>> tmp = x
>>> x = y
>>> y = tmp
>>> x, y
('Y', 'X')

But in Python, you can make the swap with a single line of code (thanks to implicit tuple packing and unpacking):

>>> x,y = y,x
>>> x,y
('Y', 'X')
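The same mechanism generalizes beyond two variables, because the whole right-hand side is evaluated into a tuple before any name on the left is rebound. A short sketch with made-up variable names:

```python
# Rotate three values in one statement.
p, q, r = 1, 2, 3
p, q, r = q, r, p
print((p, q, r))  # (2, 3, 1)

# Advance a Fibonacci pair without a temporary variable.
a, b = 0, 1
for _ in range(5):
    a, b = b, a + b
print((a, b))  # (5, 8)
```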

###Q: What will be the output of the last expression in the code below?

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda: i)
...
>>> [f() for f in flist]   # what will this print out?

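In any closure in Python, variables are bound by name, and the name is looked up when the function is called, not when it is defined. By the time the lambdas run, the loop has finished and i is 2, so the code above prints the following: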

[2, 2, 2]

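Presumably not what the author of the code intended!

A workaround is to either create a separate function or bind the current value as a default argument, which is evaluated when each lambda is defined; for example: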

>>> flist = []
>>> for i in range(3):
...     flist.append(lambda i = i : i)
...
>>> [f() for f in flist]
[0, 1, 2]
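The other workaround mentioned above, a separate function, can be sketched as follows. `make_getter` and `identity` are hypothetical helper names; `functools.partial` is shown as one more way to bind the value at creation time.

```python
import functools


def make_getter(value):
    # Each call creates a new scope, so every lambda sees its own value.
    return lambda: value


flist = [make_getter(i) for i in range(3)]
results = [f() for f in flist]


def identity(value):
    return value


# functools.partial stores the argument when the partial object is built.
plist = [functools.partial(identity, i) for i in range(3)]
partial_results = [p() for p in plist]

print(results)          # [0, 1, 2]
print(partial_results)  # [0, 1, 2]
```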

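###Q: What are the key differences between Python 2 and 3?

Although Python 2 is now considered legacy, it is still in wide enough use that it is important for a developer to know the differences between Python 2 and Python 3.

Here are some of the key differences that a developer should be aware of: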

Text and Data instead of Unicode and 8-bit strings. Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. The biggest ramification of this is that any attempt to mix text and data in Python 3.0 raises a TypeError (to combine the two safely, you must decode bytes or encode Unicode, but you need to know the proper encoding, e.g. UTF-8).

This addresses a longstanding pitfall for naïve Python programmers. In Python 2, mixing Unicode and 8-bit data would work if the string happened to contain only 7-bit (ASCII) bytes, but you would get UnicodeDecodeError if it contained non-ASCII values. Moreover, the exception would happen at the combination point, not at the point at which the non-ASCII characters were put into the str object. This behavior was a common source of confusion and consternation for neophyte Python programmers.

print function. The print statement has been replaced with a print() function.

xrange – buh-bye. xrange() no longer exists (range() now behaves the way xrange() used to, except it works with values of arbitrary size).

API changes:

  • zip(), map() and filter() all now return iterators instead of lists

  • dict.keys(), dict.items() and dict.values() now return “views” instead of lists

  • dict.iterkeys(), dict.iteritems() and dict.itervalues() are no longer supported

Comparison operators. The ordering comparison operators (<, <=, >=, >) now raise a TypeError exception when the operands don’t have a meaningful natural ordering. Some examples of the ramifications of this include:

  • Expressions like 1 < '', 0 > None or len <= len are no longer valid

  • None < None now raises a TypeError instead of returning False

  • Sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other
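Several of these Python 3 behaviors can be checked in a few lines. The sketch below assumes it is run under Python 3; the sample values and variable names are made up for illustration.

```python
text = "caf\u00e9"          # str holds text
data = text.encode("utf-8")  # bytes holds binary data

try:
    text + data              # mixing text and data is an error
except TypeError:
    mixing_raises = True
else:
    mixing_raises = False

round_trip = data.decode("utf-8")  # decode explicitly before combining

pairs = zip([1, 2], "ab")          # an iterator, not a list
pairs_is_list = isinstance(pairs, list)
pairs_as_list = list(pairs)

inventory = {"apples": 3}
keys = inventory.keys()            # a view, not a snapshot
inventory["pears"] = 5
live_keys = list(keys)             # the view reflects the later insertion

try:
    sorted([3, "two", None])       # no natural ordering between these types
except TypeError:
    mixed_sort_raises = True
else:
    mixed_sort_raises = False

print(mixing_raises, pairs_is_list, live_keys, mixed_sort_raises)
```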

More details on the differences between Python 2 and Python 3 can be found here.

###Q: Is Python interpreted or compiled?

As noted in Why Are There So Many Pythons?, this is, frankly, a bit of a trick question in that it is malformed. Python itself is nothing more than an interface definition (as is true with any language specification) of which there are multiple implementations. Accordingly, the question of whether “Python” is interpreted or compiled does not apply to the Python language itself; rather, it applies to each specific implementation of the Python specification.

Further complicating the answer to this question is the fact that, in the case of CPython (the most common Python implementation), the answer really is “sort of both”. Specifically, with CPython, code is first compiled and then interpreted. More precisely, it is not precompiled to native machine code, but rather to bytecode. While machine code is certainly faster, bytecode is more portable and secure. The bytecode is then interpreted in the case of CPython (or both interpreted and compiled to optimized machine code at runtime in the case of PyPy).
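On CPython, the compile-then-interpret pipeline can be observed with the standard `dis` module and the built-in `compile()`. This is a small sketch; `add_one` is a hypothetical function, and the exact instruction names printed vary between CPython versions.

```python
import dis


def add_one(x):
    return x + 1


# The function body has already been compiled to bytecode.
bytecode = add_one.__code__.co_code
instruction_names = [ins.opname for ins in dis.get_instructions(add_one)]

dis.dis(add_one)  # human-readable listing of the bytecode instructions

# compile() exposes the compilation step; eval() then runs the code object.
expr_code = compile("6 * 7", "<example>", "eval")
answer = eval(expr_code)
print(type(bytecode).__name__, answer)
```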

###Q: What are some alternative implementations to CPython? When and why might you use them?

One of the more prominent alternative implementations is Jython, a Python implementation written in Java that utilizes the Java Virtual Machine (JVM). While CPython produces bytecode to run on the CPython VM, Jython produces Java bytecode to run on the JVM.

Another is IronPython, written in C# and targeting the .NET stack. IronPython runs on Microsoft’s Common Language Runtime (CLR).

As also pointed out in Why Are There So Many Pythons?, it is entirely possible to survive without ever touching a non-CPython implementation of Python, but there are advantages to be had from switching, most of which are dependent on your technology stack.

Another noteworthy alternative implementation is PyPy whose key features include:

Speed. Thanks to its Just-in-Time (JIT) compiler, Python programs often run faster on PyPy.

Memory usage. Large, memory-hungry Python programs might end up taking less space with PyPy than they do in CPython.

Compatibility. PyPy is highly compatible with existing python code. It supports cffi and can run popular Python libraries like Twisted and Django.

Sandboxing. PyPy provides the ability to run untrusted code in a fully secure way.

Stackless mode. PyPy comes by default with support for stackless mode, providing micro-threads for massive concurrency.
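When code has to run on more than one of these implementations, it can ask which one it is running on. A minimal sketch using the standard `platform` and `sys` modules (the variable names are illustrative):

```python
import platform
import sys

# Documented return values include 'CPython', 'IronPython', 'Jython', 'PyPy'.
implementation = platform.python_implementation()
version = platform.python_version()
running_on_pypy = implementation == "PyPy"

print(implementation, version)
print(sys.implementation.name)  # lowercase identifier, e.g. 'cpython'
```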

###Q: What’s your approach to unit testing in Python?

The most fundamental answer to this question centers around Python’s unittest testing framework. Basically, if a candidate doesn’t mention unittest when answering this question, that should be a huge red flag.

unittest supports test automation, sharing of setup and shutdown code for tests, aggregation of tests into collections, and independence of the tests from the reporting framework. The unittest module provides classes that make it easy to support these qualities for a set of tests.

Assuming that the candidate does mention unittest (if they don’t, you may just want to end the interview right then and there!), you should also ask them to describe the key elements of the unittest framework; namely, test fixtures, test cases, test suites and test runners.
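The four elements can be seen together in a small sketch. The function under test, `word_count`, and the helper `run_tests` are hypothetical and exist only for this illustration.

```python
import unittest


def word_count(text):
    """Hypothetical function under test: count whitespace-separated words."""
    return len(text.split())


class WordCountTests(unittest.TestCase):  # a test case
    def setUp(self):                      # test fixture: runs before each test
        self.sample = "the quick brown fox"

    def test_counts_words(self):
        self.assertEqual(word_count(self.sample), 4)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)


def run_tests():
    # A test suite aggregates the tests; a test runner executes and reports.
    suite = unittest.TestLoader().loadTestsFromTestCase(WordCountTests)
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(suite)


result = run_tests()
print(result.wasSuccessful())
```

In a real project the tests would more commonly be discovered and run from the command line with `python -m unittest`, rather than through a helper like `run_tests`.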

A more recent addition to the unittest framework is mock. mock allows you to replace parts of your system under test with mock objects and make assertions about how they are to be used. mock is now part of the Python standard library, available as unittest.mock in Python 3.3 onwards.

The value and power of mock are well explained in An Introduction to Mocking in Python. As noted therein, system calls are prime candidates for mocking: whether writing a script to eject a CD drive, a web server which removes antiquated cache files from /tmp, or a socket server which binds to a TCP port, these calls all feature undesired side-effects in the context of unit tests. Similarly, keeping your unit-tests efficient and performant means keeping as much “slow code” as possible out of the automated test runs, namely filesystem and network access.
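As a hedged sketch of mocking a filesystem call, the example below patches `os.path.isfile` and `os.remove` so that no real file is touched. The helper `remove_stale_file` and the paths are hypothetical.

```python
import os
from unittest import mock


def remove_stale_file(path):
    """Hypothetical helper: delete path if it is an existing file."""
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


# Pretend the file exists and verify that os.remove is called with its path.
with mock.patch("os.path.isfile", return_value=True), \
        mock.patch("os.remove") as fake_remove:
    removed = remove_stale_file("/tmp/stale.cache")
    fake_remove.assert_called_once_with("/tmp/stale.cache")

# Pretend the file is missing and verify that os.remove is never called.
with mock.patch("os.path.isfile", return_value=False), \
        mock.patch("os.remove") as fake_remove_missing:
    skipped = remove_stale_file("/tmp/missing.cache")
    fake_remove_missing.assert_not_called()

print(removed, skipped)  # True False
```

Both patches are undone when their `with` block exits, so the real `os.remove` is restored afterwards.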


####[Note: This question is intended for Python developers who also have Java development experience.]

###Q: What are some key differences to bear in mind when coding in Python vs. Java?

Disclaimer #1. The differences between Java and Python are numerous and would likely be a topic worthy of its own (lengthy) post. Below is just a brief sampling of some key differences between the two languages.

Disclaimer #2. The intent here is not to launch into a religious battle over the merits of Python vs. Java (as much fun as that might be!). Rather, the question is really just geared at seeing how well the developer understands some practical differences between the two languages. The list below therefore deliberately avoids discussing the arguable advantages of Python over Java from a programming productivity perspective.

With the above two disclaimers in mind, here is a sampling of some key differences to bear in mind when coding in Python vs. Java:

Dynamic vs static typing. One of the biggest differences between the two languages is that Java is restricted to static typing whereas Python supports dynamic typing of variables.

Static vs. class methods. A static method in Java does not translate to a Python class method.

In Python, calling a class method involves an additional memory allocation that calling a static method or function does not.

In Java, dotted names (e.g., foo.bar.method) are looked up by the compiler, so at runtime it really doesn’t matter how many of them you have. In Python, however, the lookups occur at runtime, so “each dot counts”.

Method overloading. Whereas Java requires explicit specification of multiple same-named functions with different signatures, the same can be accomplished in Python with a single function that includes optional arguments with default values if not specified by the caller.

Single vs. double quotes. Whereas the use of single quotes vs. double quotes has significance in Java, they can be used interchangeably in Python (but no, it won’t allow beginning the same string with a double quote and trying to end it with a single quote, or vice versa!).

Getters and setters (not!). Getters and setters in Python are superfluous; rather, you should use the ‘property’ built-in (that’s what it’s for!). In Python, getters and setters are a waste of both CPU and programmer time.

Classes are optional. Whereas Java requires every function to be defined in the context of an enclosing class definition, Python has no such requirement.

Indentation matters… in Python. This bites many a newbie Python programmer.
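A few of the points above (optional arguments in place of overloads, `property` in place of getters and setters, and class methods versus static methods) can be sketched in one small class. The `Temperature` class and its members are hypothetical and only meant as an illustration.

```python
class Temperature:
    def __init__(self, celsius=0.0):
        # A default value stands in for a second, overloaded constructor.
        self.celsius = celsius  # goes through the property setter below

    @property
    def celsius(self):
        # Read like a plain attribute; no get_celsius() needed.
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Receives the class itself, so it also works for subclasses.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_freezing(celsius):
        # Receives neither the instance nor the class.
        return celsius <= 0


default_temp = Temperature()
boiling = Temperature.from_fahrenheit(212)
print(default_temp.celsius, boiling.celsius, Temperature.is_freezing(-5))
```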

##The Big Picture

An expert knowledge of Python extends well beyond the technical minutiae of the language. A Python expert will have an in-depth understanding and appreciation of Python’s benefits as well as its limitations. Accordingly, here are some sample questions that can help assess this dimension of a candidate’s expertise:

###Q: What is Python particularly good for? When is using Python the “right choice” for a project?

Although likes and dislikes are highly personal, a developer who is “worth his or her salt” will highlight features of the Python language that are generally considered advantageous (which also helps answer the question of what Python is “particularly good for”). Some of the more common valid answers to this question include:

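Ease of use and ease of refactoring, thanks to the flexibility of Python’s syntax, which makes it especially useful for rapid prototyping.

More compact code, thanks again to Python’s syntax, along with a wealth of Python libraries (distributed freely with most Python language implementations).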

A dynamically-typed and strongly-typed language, offering the rare combination of code flexibility while at the same time avoiding pesky implicit-type-conversion bugs.

Free and open source. Need we say more?
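The combination of dynamic and strong typing mentioned in the list above can be illustrated briefly. In this sketch (with made-up names), a name may be rebound to a value of another type, but mixing incompatible types is never silently converted.

```python
value = 42
value = "forty-two"  # dynamic typing: the name can be rebound to a str

try:
    "1" + 1          # strong typing: no implicit str/int conversion
except TypeError:
    implicit_conversion_rejected = True
else:
    implicit_conversion_rejected = False

explicit_sum = int("1") + 1     # conversions must be spelled out
explicit_text = "1" + str(1)

print(implicit_conversion_rejected, explicit_sum, explicit_text)
```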

With regard to the question of when using Python is the “right choice” for a project, the complete answer also depends on a number of issues orthogonal to the language itself, such as prior technology investment, skill set of the team, and so on. Although the question as stated above implies interest in a strictly technical answer, a developer who will raise these additional issues in an interview will always “score more points” with me since it indicates an awareness of, and sensitivity to, the “bigger picture” (i.e., beyond just the technology being employed). Conversely, a response that Python is always the right choice is a clear sign of an unsophisticated developer.

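###Q: What are some drawbacks of the Python language?

For starters, if you know a language well, you know its drawbacks, so responses such as “there’s nothing I don’t like about it” or “it has no drawbacks” are very telling indeed.

The two most common valid answers to this question (by no means an exhaustive list) are:

The Global Interpreter Lock (GIL). CPython (the most common Python implementation) is not fully thread safe. To support multi-threaded Python programs, CPython provides a global lock that the current thread must hold before it can safely access Python objects. As a result, no matter how many threads or processors are present, only one thread executes Python bytecode at any given time. By comparison, it is worth noting that the PyPy implementation discussed earlier provides a stackless mode with micro-threads for massive concurrency.

Execution speed. Python can be slower than compiled languages since it is interpreted. (Well, sort of. See our earlier discussion on this topic.)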
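The effect of the GIL on CPU-bound work can be explored with a rough timing sketch like the one below. The function names and the workload size are arbitrary choices for illustration. On a standard GIL-enabled CPython build the threaded run is typically no faster than the sequential one, but the exact numbers depend on the machine and interpreter, so none are shown here.

```python
import threading
import time


def count_down(n):
    """CPU-bound busy loop with no I/O."""
    while n > 0:
        n -= 1


def run_sequential(jobs, n):
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(jobs, n):
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential_seconds = run_sequential(4, 200_000)
threaded_seconds = run_threaded(4, 200_000)
print(f"sequential: {sequential_seconds:.3f}s, threaded: {threaded_seconds:.3f}s")
```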

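##Wrap-up

The questions and tips presented in this guide can help you identify true Python masters. We hope you find them a useful tool for “separating the wheat from the chaff” in your search for top Python developers. Bear in mind, though, that they are meant only as tools to be incorporated into your overall recruiting toolbox and strategy.

And for those who may have read this guide by mistake, hoping to learn how to catch the reptile (sorry dude, wrong kind of python!), we recommend the Wildlife Foundation of Florida’s Python Challenge.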