Python - Yahoo News Search Results

Astros bullpen catcher Javier Bracamonte takes the brunt of clubhouse prank

Carlos Correa wasn’t afraid to say good morning to Sunshine the Python. #AstrosST — Houston Astros (@astros) February 26, 2015 KISSIMMEE, Fla. – Astros bullpen catcher Javier Bracamonte showcased his hurdling abilities Thursday morning after he saw an Albino Burmese Python in the clubhouse at Osceola County Stadium. Astros manager A.J. Hinch set up the prank by asking ...

100kg python tamed by a woman

KUCHING: It was an odd sight, at least for Suhaini Abdullah, to see a woman dealing with a 1.52-metre-long python, which had devoured two of his pigeons in Kenyalang Park here today. Despite the shocking discovery of the 100kg snake, Suhaini, 60, was more interested in the skills of the woman, who is an officer with the Civil Defense Department, in taming the slithering reptile. According to ...

3.9-m python turned over to DENR

The Agusan National High School in Butuan City has turned over a 3.9-meter reticulated python (Python reticulatus) to the regional office of the Department of Environment and Natural Resources (DENR-13) through its Enforcement Division and Technical Services. Officials of the DENR Enforcement Division said the python was the largest snake rescued so far.

Microsoft embraces Python, Linux in new big data tools

Continuing its quest to make Microsoft Azure comfy for the non-Windows world, Microsoft just launched a preview of its Hadoop-based cloud tool (HDInsight) that runs on Linux. It’s also making its Azure ...

Review: Monty Python's Spamalot, Grand Opera House, York

SPAM, spam and more spam. We like Monty Python's Spamalot rather a lot in York, where the Eric Idle and John Du Prez Broadway show has returned for a third time after earlier touring runs in 2010 and 2012.

Monty Python Star Terry Jones to Visit Bulgaria in March

British comedian Terry Jones will attend this year's Sofia International Film Festival, starting March 5 in Bulgaria's capital Sofia. Over the course of the past 19 years, the audience of the festival has had the opportunity to get to know better not only the Monty Python movies, but also the creative teams behind the scenes. Nineteen years ago, Terry Jones attended the festival himself and ...

Announces New Python API for Sparkling Water

Today, the leader in open source machine learning for smarter applications, announces the availability of the company’s Sparkling Water™ for Python developers. Through the n

Python bites V8 Supercars champ Whincup

V8 Supercars champion Jamie Whincup got more than he bargained for at a promotional event on Thursday when a black-headed python sank its fangs into him. The sport's undisputed number one, a six-time winner of the touring car championship, had renewed his contract with Red Bull Racing until 2018 and headed to Sydney's Taronga Zoo for the official announcement of the deal. The python was draped ...


recent bookmarks tagged Python

Extract text from any document; no muss, no fuss. | Datascope

Posted on 18 January 2038


Posted on 28 February 2015

Sigil Ebook - Sigil is a multi-platform EPUB ebook Editor

Posted on 27 February 2015

Python Test - Online Hands-on Programming Skills Test & Certification from ExpertRating

Posted on 27 February 2015


Posted on 27 February 2015

Noodle :: - Powered by Django

Posted on 27 February 2015

Iniciando com Django Rest Framework e AngularJS / Mateus Pádua Web Developer

Posted on 27 February 2015

Short Course: Introduction to Python and Pandas (for Data Munging and Machine Learning) | Jeremy Chen's Website

Posted on 27 February 2015

Interfaces gráficas con wxPython

Posted on 27 February 2015

5 Projects To Speed Up Python's Performance

Posted on 27 February 2015

Top Answers About Python (programming language) on Quora

Does Google hire programmers who use Python as their main programming language?

Can you get into Google as a smart programmer whose main language is Python? Yes. That said, you're going to want to learn C++ if you're planning for a career in Google, because it's the most respected of the "Google Languages" (Go was too new for me to evaluate in that regard, when I was there) and the machine learning projects are most likely going to be using it. Python's reputation is that it's not fit for production, and Java would make you a Java programmer. (Doing Java at Google-- at least, as of my experience which is 3 years dated, but I doubt much has changed-- means actually writing Java, not Scala or Clojure, so the cool kids write C++ and you should too.) The good news is that Google C++ is a lot more civilized than C++ in the wild. Google has some rotting projects but, averaged across the company, code quality is pretty damn high there. So the C++ that you encounter will probably be better than what you'd expect in a C++ shop.

See question on Quora

Posted on 10 February 2015

Why are IT systems in big enterprises usually built using Java, instead of Python or JavaScript?

Questions like this are usually asked by those with limited experience in large systems development, often after having just learned programming or done some small project work in the first languages they've acquired knowledge in.

For starters, neither Python nor JavaScript is a modern language compared to Java.  JavaScript is only a couple of years younger than Java and before that it was called LiveScript.  Python is OLDER than Java by several years.

"Modern" does not mean "better".

Talking to other systems is a function of protocols, not languages.  You need to understand that distinction.  As long as you can speak the protocol, like HTTP or SOAP, you choose the language best suited for the job.

Java is NOT legacy.  It is mainstream.  And it is mainstream for a reason.  Virtually all of the common enterprise tasks one might want to perform are available in Java either natively or through many frameworks.

It can scale and has many years behind it in how to do this.  That is huge in large enterprises with high volume systems.  JavaScript libraries are playing catch-up here and it will likely be 5-10 years before they work out all of the kinks in doing this well, the current framework wars shake out winners, and de facto standards emerge.

What about skillsets?  Too many people who prattle on about JavaScript and Python fail to consider that.  Companies need to hire people.  The number of skilled people determines how hard and how expensive that task is going to be.  Java skills are easier to find.  Paradoxically, they are also harder to find because the demand for skilled Java developers is so high that it is hard to find them because they don't move around much!

Java has had a long time to figure out what works well in a modern web development and service development context.  Python has a couple of well-known options like Django or TurboGears. 

JavaScript is all over the map right now playing the "let's reinvent Java because it is cool to do so" game.  JavaScript was never meant to be a back-end language and adapting it to that role is going to take time.  Hell, .NET developers have been seeing the "new" MVC framework from Microsoft in the past couple of years.  Java web developers are rolling on the floor howling with laughter at that one!  We were doing that 15 years ago and have used that time to refine what works and what doesn't!  Microsoft and the "modern" languages will be playing catch-up for some time.

Managing large Java codebases is well known at this point.  It inherited that from its C ancestry.  Source control, build and deployment pipelines and most development methodologies like Agile have their roots in the Java space.

Legacy?  I sneer at that.  Give me a couple of good Java developers and I can probably wipe the floor with anyone using Python or JavaScript to roll out an idea and do it in the same amount of time and richness of capability.  With the advantage that when the idea needs to grow, I know my enterprise-grade solution will be able to scale because I'll have engineered it that way applying years of long experience.  Because I won't be reinventing the wheel when I crack out my JDBC frameworks, NoSQL frameworks, JSF2, Primefaces, Web Services components and the like.  You'll still be writing orchestration code or a front-end widget while I'm negotiating with my customers on what text their labels should have, hard work long behind me.

And I say this as someone who is fluent in Python and JavaScript and uses them in production systems.  And believe it or not, despite Java's strong typing, it is actually as dynamic in its behavior as Python or JavaScript are.  Java Reflection is the core of virtually every modern Java framework and we've been doing runtime inspection, runtime dependency injection and dynamic typing and runtime type determination for well over a decade.  Almost every piece of Java code I write, especially in a web application, is more-or-less a dynamically typed solution (i.e. JavaBean properties and dependency injected classes).

Languages are a means to an end.  They do not exist solely to justify themselves.  They are tools.  You're writing business solutions, not programming solutions. This is the first great epiphany you must come to understand.

See question on Quora

Posted on 18 January 2015

Why are IT systems in big enterprises usually built using Java, instead of Python or JavaScript?

At Spotify, we use Java extensively in the backend. This is not for legacy reasons, it's an active choice. We use Python too, but we have moved more and more to Java. The reason is that Java is much easier to get to perform well. Python is easy to write initially, but getting it to perform well when being hammered by 15 million paying users is another matter.

I personally don't understand how a medically sane person can like the Java syntax. However, no intelligent person can deny that the JVM is pretty darn good. It is fast, well-tested, well-documented and under active development. This cannot be said about many tools in software development.  

We used to have quite a bunch of C++ services, but while you can get C++ very fast too, it's harder to write, especially if you want the code to be maintainable. Java is a compromise that hits a sweet-spot for us.

Clojure is gaining traction at Spotify, and many new services are written in it, but it's not as widespread yet. While Clojure is certainly a better language, Java has the advantage of being non-weird. Java is an uncontroversial programming language that all experienced programmers can jump into with little effort, and that is a big advantage.

See question on Quora

Posted on 17 January 2015

Why are IT systems in big enterprises usually built using Java, instead of Python or JavaScript?

Python and JavaScript are qualitatively different languages from Java. And I would certainly argue against JavaScript being "more modern" than Java: they come from the same time.

Python and JavaScript are both interpreted languages (yes, I know Python compiles to .pyc) whereas Java is a compiled language. Java runs under the JVM, which is a carefully designed secure environment; Python and JavaScript are deliberately designed as insecure "you can do anything" languages. Java has language constructs designed to make managing large projects with millions of lines of code possible; neither Python nor JavaScript does. My experience with Python is that it is not suitable for large projects.
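
To make the parenthetical about .pyc concrete, here is a minimal sketch, assuming CPython 3. It uses the built-in `compile` and the standard `dis` module to show that source text is first turned into bytecode, which the interpreter then executes. The one-line program and the variable names are made up for illustration.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = "total = sum(range(5))"

# compile() turns source text into a code object holding bytecode,
# the same kind of object CPython caches in .pyc files.
code_obj = compile(source, "<example>", "exec")

# Run the bytecode in a fresh namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])

# Print a human-readable listing of the bytecode instructions.
dis.dis(code_obj)
```

The exact instructions printed by `dis` differ between CPython versions, so the listing is best read as a rough picture rather than a stable format.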

If I wanted a "more modern" language than Java, I would probably go for something like Scala.

See question on Quora

Posted on 15 January 2015

Why are IT systems in big enterprises usually built using Java, instead of Python or JavaScript?

  1. Many companies have systems that have to be maintained for a long time, yet they do not have dedicated staff to maintain them. Rather, the system is written by some contractors, then just sits there until something new is needed, when a different set of contractors comes in, etc. With that approach, it is important that you use something standardized that is popular for this general type of system, so you can always find somebody to maintain it for you. Java and C# are very popular for writing big business automation systems, so there are many contractors who can write big business automation systems in them, so they remain popular for writing big business automation systems. This becomes a self-fulfilling prophecy, but that makes it no less convincing of an argument.
  2. For similar reasons, it is important that you use something stable. With many dynamic/scripting languages like the ones you mentioned, 3 years is considered more than adequate notice to deprecate a language or library feature. With Java, on the other hand, you can still run a 10-year old program without modification. This is often very important to people doing business automation.
  3. The presence of a big corporation backing a language and the associated set of libraries and tools reassures decision-makers in large corporations that support is good and won't go away soon. (Whether this is actually true is a whole 'nother question.) Python, PHP, and Ruby are all originally hobby/academic projects, and have grown some amount of corporate support, but mostly from rather small firms that do not impress your typical Fortune 500 CIO. C# is backed by Microsoft, and Java comes out of Sun, which may have gone under, but is still backed heavily by Oracle, IBM, and others.
  4. Big business automation projects require different libraries and frameworks than typical dynamic web sites do. You want to be able to talk to Oracle and SAP, for instance. These sorts of frameworks and libraries tend to be for Java or C#, and this too is self-perpetuating.
  5. Sometimes, Java or C# may actually be a technically better alternative. I mention this argument last, but it's not just to point out the logical possibility. For instance, Java has a decent threading model and there exist high-performance concurrent data structure libraries. The standard Python implementation has poorly implemented threading, and the standard PHP implementation has for all practical purposes no threading at all. Also, strong typing and compile-time name resolution, while slowing down exploratory programming, do increase the number of bugs that can be statically caught.
And now an admittedly slightly off topic remark: The question mentions languages with a lot of compile-time checking that are tedious to use, like Java and C#, and languages with very little compile-time checking that are not tedious to use, like Python and Ruby. For completeness, it must be said that the amount of compile time checking a language does need not necessarily correlate directly with the amount of boilerplate that you have to type to help it do so. Among the languages that have very tight compile-time checking but that are considerably cleverer than Java about inferring what it is they are supposed to check, ML (OCaml, Standard ML, F#), Haskell, and Scala are especially worth knowing about. Among these, Scala has the best chances to become truly mainstream, because it integrates so well with the mature and open source Java runtime and libraries.

See question on Quora

Posted on 15 January 2015

How many boring steps in programming were there for you, before it became exciting?

I got stuck playing Grand Theft Auto: San Andreas on my PC! I had to figure out a way to get rid of the update to unlock my old saves. I ended up messing with one piece of the source code that was open to modify, it was written in C (If I only had known what C is).

I went to buy this book (This exact copy) for $75. In North Iraq $75 is a lot of money:

My cousin, Misho, who was an actual Engineer at the time, told me a few things:

  1. First, that is C++, you wanted C, those are not that close. AND no the ++ doesn't mean a better version. (My logic back then :/ )
  2. Second, this is for someone who can understand at least College level English, wait do you even know what a TextBook is Yad?
  3. Third, why did you break the game again? It doesn't look like that you are going to be playing it anytime soon.

Hence, the journey started my friend! I went on an epic mission to fix the game back! Didn't know what was coming, it got dark really fast.

I ended up making a MOD on the game and never being able to fix it. I downloaded the free available 3D car models and added them to the game.
My first 101 programming project as a 16-year-old (It was one of the most exciting things I have ever done in my life).

When I showed the game Mod to my friends, the reaction was something like this:

For those who are interested here is what the game ended up like:
Hitman: Blood Money MOD in GTA :):

I found a Forum that gave all the Car Models for free and I gave them my Mod for free. Back then, startup tactics were that simple!

See question on Quora

Posted on 14 January 2015

Which language is best, C, C++, Python or Java?

If you are writing an operating system, I suggest you use C.
If you are writing a very complex application where execution speed is extremely important, I suggest you use C++.
If time to market is key, but execution speed is not important, I suggest you use Python.
If your boss told you: "do it in Java or you are fired" I suggest you use Java and look for a better workplace.

See question on Quora

Posted on 7 January 2015

Do you really need to learn C to learn C++, Java and Python?

No, you don't. Many introductory programming courses are taught in Java or Python, and no knowledge of C is expected. My guess is that most practicing programmers in Java and Python would take quite some time to become productive in C. (Not sure about C++.)

That being said, if you learn C, you will learn some important low-level details about how, for example, data are stored in memory and how memory is managed. This may help you understand design decisions and performance characteristics associated with other languages.

See question on Quora

Posted on 3 January 2015

Why do we return 0 to the OS when we exit with no errors, but boolean functions within the code generally return 1 (true) to indicate all is fine?

This is a tradition, which goes all the way back to the early 1960s.  A return code of "0" means OK.  The reason for this is that a register setting that is 0 is easy to test for with a very fast machine instruction.   The test sets a hardware flag that can be branched on, whether with BCT, BZ, JZ or BEQ, depending on computer architecture.   Dropping through the test (non-zero) would go into the error handling part of your code that tests for specific values.  The register which contained the return code was always the same by convention and would contain the value after the registers were restored before the return.  There was no such thing as Try/Catch to handle errors, only return codes from procedures.

See question on Quora

Posted on 11 December 2014

Why do we return 0 to the OS when we exit with no errors, but boolean functions within the code generally return 1 (true) to indicate all is fine?

Because the exit code is answering the question "were there any problems?" as opposed to "was the program successful?".

Moreover, it's actually more than just a boolean: the number returned is a code that can specify what sort of error it was. Depending on the program, an exit code of 1 can be very different from 255.

The neat thing is that this approach works as both a boolean and a richer code, at least in C. By answering what is, in essence, a negative question, it elegantly covers two different use cases at once.
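
A minimal sketch of that dual use, written in Python rather than C: the hypothetical helper `run_child` starts a tiny child interpreter that exits with a chosen code, and the parent then reads the code both as a yes/no answer and as a specific error number. The helper name and the codes 1 and 3 are illustrative assumptions, not part of any convention.

```python
import subprocess
import sys

def run_child(exit_code):
    """Run a tiny child Python process that exits with the given code."""
    result = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    )
    return result.returncode

for code in (0, 1, 3):
    rc = run_child(code)
    # "Were there any problems?": zero means no, non-zero says which one.
    status = "ok" if rc == 0 else f"failed with code {rc}"
    print(status)
```

Because zero is the only falsy integer, `if rc:` reads naturally as "if something went wrong", while the value itself still says what went wrong.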

See question on Quora

Posted on 11 December 2014

What should I learn C++ or Python?

I would suggest learning Python first and then tackling C++ later.

Python abstracts you from a lot of the messier details of the machine and lets you focus on programming. First you want to get good at just programming in general and getting stuff done. Getting stuff done helps you get a job.

Then later on comes refining your craft. At this point you learn C++ and you start learning what happens behind the scenes. C++ will make you a stronger developer overall. You should know that only a minority of developers actually do C++ well. It's not for the faint of heart. It takes years and years.

See question on Quora

Posted on 28 November 2014

What should I learn C++ or Python?

You will hear a lot of suggestions to learn C and C++ first.  Those folks might be sincere, or they might just be trying to keep you out of the job market and competing with them.

Most schools have gone to teaching Java or Python, after a few horrible decades teaching C and C++.    C and C++ are like running chainsaws, and you don't hand a running chainsaw to a blindfolded toddler.

If you learn Python first, you will avoid thousands of gumption-traps in C and C++.   Random crashes.  Incomprehensible compile errors.  Undebuggable crashes on code that looks just perfect.  Pages of gibberish for error messages on your first attempt to use the std stuff.    Just a horrible place to learn anything.

The only up-side is if you can learn C and C++ and write non-trivial working programs, that's QUITE an accomplishment.   Anything else will seem easy by comparison.

So, you decide, learn to swim in the shallow end of the pool, or the 5,000 foot deep end?     You decide.   No lifejackets either.

See question on Quora

Posted on 21 November 2014

Does Python have a future?

from __future__ import braces #Seriously, try it !
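
For anyone who would rather not type that into a REPL, here is a small sketch of what happens, assuming CPython 3. The statement is passed through the built-in `compile` so that the resulting `SyntaxError` can be caught and printed instead of aborting the script.

```python
# compile() is used so the SyntaxError can be caught at run time.
try:
    compile("from __future__ import braces", "<example>", "exec")
except SyntaxError as exc:
    message = exc.msg
    print(message)
```

CPython rejects the import with the message "not a chance", which is the language's way of saying that block delimiters are not on the roadmap.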

Python is one of the most widely adopted general-purpose modern scripting languages. (JavaScript is used more by pure volume, but is fairly rare for anything other than webpages.) It seems a pretty safe bet that Python is here to stay over the next decade.

Longer term there's no way to know for sure with any language, but it's not really worth worrying about because IF Python is ever supplanted by a newer shinier language, it's almost guaranteed that most of the things you learn in Python will transfer easily to the new language.

When you're learning Python (or any language) the specifics of the language are not where most of your effort will be spent anyway. Instead you'll learn a lot of general principles, and those apply no matter what language you'll use for implementing them a few decades from now.

See question on Quora

Posted on 4 November 2014

Why is it easier to learn a programming language once you already know one?

  1. Programming languages often address the same challenges, sometimes in similar ways. So, you know what to expect and can understand things in comparison.
  2. Programming languages share some of their core concepts and some fraction of their syntax.
  3. Your brain adapts to the process of learning a new language.
  4. The mechanics of using programming languages are not very language-dependent (at least for within the same language type), so you become more efficient in completing basic tasks, such as editing source code, and in debugging.

See question on Quora

Posted on 29 October 2014

As a starting Python programmer I see a lot of praise for the Python language (and so far I can only agree). Isn't there anything bad to say about it? What is a real con?

While the good outweighs the bad for most cases, there are definitely some bad parts about Python:

- It's slow. That it is both dynamically typed and interpreted means that performance takes a hit.

- The Global Interpreter Lock (GIL) makes it hard to run CPU-bound code in parallel across threads.

- In Python 2, `print` is a statement and doesn't require parentheses, which is inconsistent with the rest of the language.

- Though everything is an object, there are a number of builtin functions which make the language inconsistent. For example `[1, 2, 3].len` would be more consistent than `len([1, 2, 3])`. Ruby is not inconsistent in this way.

- {'a': 1, 'b': 2} is how you define a dictionary. {1, 2} is how you define a set. What does {} mean? (A dictionary. Dictionary notation came first.)

- (1, 2,) defines a tuple. (1, 2) defines a tuple. (1,) defines a tuple. (1) is just 1. (,) is a syntax error. () defines a tuple.

-  ({} == []) != (bool({}) == bool([]))

- Since False == 0 and since Python is dynamically typed, you can do nonsensical operations like [1, 2, 3] * False (which equals []). In a sane typed language, that would probably throw an error.

- There is no good way to add infix operators.

- Python sets hard limits on the stack height, which can be a problem if you are doing anything recursive in nature. Technically, you can rewrite recursive algorithms in an iterative form to get around this issue, but this is impractical for complex functions like the Ackermann function.

- Guido dislikes reduce (a higher-order function) and its use is discouraged. Unfortunately, this means that anytime you have a function which could benefit from being abstracted, you cannot abstract it. This is basically a forced design pattern.

- There is no way to express a `do-while` statement, which forces workaround patterns such as `while True:` with a `break` at the end of the body.

- Python's packaging system is extremely complicated. For starters, you need to install pip yourself using easy_install. Distribution is simple enough to get started with, but if you need to do anything complex you will need to study the history of the various options (distutils, setuptools, etc).

- Certain libraries in the standard library are showing their age, but there is nothing in the documentation that makes that clear. For example, imaplib fails to intelligently parse responses in the IMAP protocol, meaning that you sometimes need to construct data structures from lists represented as strings. (imaplib was written in 1997, before Python 1.5 was released.)

- List comprehensions leak scope in Python 2. For example, `[x for x in xs]` will put `x` in scope. This can be dangerous if you had previously defined `x`.

- Python relies heavily on idioms. To the master this is no problem, but to the novice you have to discover the idioms. For example, to repeat a block `n` times you want to know the idiom `for _ in range(n):`.

- Multi-line `if` statements are hard to read. Consider:

if (collResv.repeatability is None or 
    collResv.rejected = True

- Subclass relations aren't transitive. [1]

- When using byte strings and Unicode strings, Python 2 does implicit conversions which can be confusing if you don't know what's happening. For example,

>>> "Hello " + u"World"
u'Hello World'
>>> "Hello\xff " + u"World"
Traceback (most recent call last):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 5: ordinal not in range(128)


>>> "foo" == u"foo"
True
>>> "foo\xff" == u"foo\xff"
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
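
The session above is Python 2 behaviour. As a point of comparison, here is a short sketch of the same operations under Python 3, where text (`str`) and `bytes` are separate types and nothing is converted implicitly. The variable names are only for illustration.

```python
# Python 3: text (str) and bytes are distinct types and never mix implicitly.
mix_error = None
try:
    "Hello " + b"World"
except TypeError as exc:
    mix_error = exc
    print("TypeError:", exc)

# Comparing str with bytes is simply False; no decoding is attempted.
same = "foo" == b"foo"
print(same)

# Conversions must be spelled out, with an explicit encoding.
joined = "Hello " + b"World".decode("ascii")
print(joined)
```

The failure now happens on every mixed operation rather than only when a non-ASCII byte shows up, which makes the mistake much easier to find.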

[1]: Python Subclass Relationships Aren't Transitive
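
Several of the syntax points in the list above are easy to check directly. The following is a small sketch, assuming Python 3, that exercises the literal quirks, the `False == 0` arithmetic, the `len` builtin, comprehension scoping, and the usual stand-in for `do-while`. All variable names are made up for illustration.

```python
# {} is an empty dict; an empty set has to be spelled set().
empty_braces = {}
print(type(empty_braces).__name__, type({1, 2}).__name__)

# The comma makes the tuple, not the parentheses.
one_tuple = (1,)
just_int = (1)
print(type(one_tuple).__name__, type(just_int).__name__)

# An empty dict and an empty list are unequal, yet both are falsy.
print({} == [], bool({}) == bool([]))

# bool is a subclass of int, so False behaves like 0 in arithmetic.
print([1, 2, 3] * False)

# len() is a builtin that delegates to the __len__ method.
print(len([1, 2, 3]) == [1, 2, 3].__len__())

# Python 3 gives comprehensions their own scope, so x is not overwritten.
x = "outer"
squares = [x * x for x in range(3)]
print(x, squares)

# A common stand-in for do-while: loop forever, test at the bottom.
attempts = 0
while True:
    attempts += 1          # body runs at least once
    if attempts >= 3:      # exit condition checked after the body
        break
print(attempts)
```

Note that the comprehension result differs from Python 2, where `x` would have been left holding the last loop value.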

See question on Quora

Posted on 19 October 2014

Why is tail recursion optimisation not implemented in languages like Python, Ruby, and Clojure? Is it just difficult or impossible?

There are different issues at play for the various languages.


For Clojure, the difficulty is with the JVM that does not support proper tail calls at a low level. It is the Java Virtual Machine, after all! People almost never use recursion in Java. I think Clojure could have proper tail calls if it used its own calling convention, but it wants to be easily compatible with existing Java code and so sticks with the standard Java calling convention.

Scala, also being on the JVM, has a very similar story to Clojure. The difference is that the Scala compiler is intelligent enough to optimize a directly recursive tail call into a loop. However, this still misses out on mutually recursive functions as well as other, potentially non-recursive uses for tail calls. This makes certain functional programming patterns, like continuation passing style and certain monads, much harder to use effectively.

There's been some thought towards supporting a `tailcall invoke` instruction in the JVM bytecode for full proper tail calls. This proposal would rectify the problem with implementing tail calls, but I'm not sure how likely it is to actually get implemented.

Besides technical issues on the JVM, people in the Clojure community also think that making tail calls explicit is important from a language design point of view. It's like an extra check to ensure that the code you expect to have a proper tail call actually does. Personally, this reasoning feels a bit like "sour grapes" to me, especially since their approach doesn't scale to more advanced functional patterns, just like Scala's.


Rust also doesn't support proper tail calls because of its calling convention. Rust aims to be fully compatible with C, and the C calling convention was naturally not designed with tail calls in mind! Rust could not maintain tail calls, C compatibility and satisfactory performance at the same time, so tail calls had to go.

Another common reason to avoid tail calls is that they compromise stack traces. The whole point of a proper tail call is not to use the stack, so standard debugging tools that rely on stack frames won't work properly. This makes languages that heavily rely on stack traces for debugging leery of tail calls:
Some languages really like their stack traces...

This is a difficult problem. It's certainly possible to instrument tail calls in useful ways, but it's very hard to do that without sacrificing performance and without breaking compatibility with other languages and tools.

Stack traces are one of the reasons that Python does not support tail calls. However, the main reason is that Guido does not like recursion (!) and wants to encourage people not to use it. I don't even want to know what he thinks about continuation-passing style!
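
A minimal sketch of what the missing tail calls mean in practice, assuming CPython with its default recursion limit. The function `count_down` is tail-recursive in form but still consumes one stack frame per call. The hypothetical `count_down_step` and `trampoline` show one common workaround, in which each step returns a zero-argument callable and a plain loop drives the computation.

```python
def count_down(n):
    """Tail-recursive in form, but CPython still pushes a frame per call."""
    if n == 0:
        return "done"
    return count_down(n - 1)

hit_limit = False
try:
    count_down(100_000)
except RecursionError as exc:
    hit_limit = True
    print("RecursionError:", exc)

def count_down_step(n):
    """Same logic, but returns a zero-argument callable instead of calling."""
    if n == 0:
        return "done"
    return lambda: count_down_step(n - 1)

def trampoline(result):
    """Keep invoking returned callables until a plain value comes back."""
    while callable(result):
        result = result()
    return result

print(trampoline(count_down_step(100_000)))
```

The trampoline keeps the stack flat at the cost of an extra function object per step, and this simple version assumes the final result is not itself callable.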

Some languages don't support tail calls just because they're hard to implement. I think this is the case for Ruby: it wants to support a bunch of different platforms like JRuby and so on, but tail calls on the JVM are difficult (as covered above). So instead, Ruby has tail call elimination on some implementations but not others, which means you can't really rely on it.

JavaScript doesn't support tail calls because it's a complete mess of a language with a dysfunctional standardization process. Happily, they're slated for ES6, so future versions will actually have them! It's already beating the rest of the languages I talked about, most of which should feel ashamed.

So: languages don't support tail calls for a bunch of reasons. Some of these are practical effects of legacy tools and software, others are products of short-term thinking. To a large extent, it's also because there has not been much pressure for it: the deviants that actually wanted tail calls could always go over to a real functional language like Scheme or ML or Haskell. Happily, this last one is changing as functional programming gains more and more market share.

See question on Quora

Posted on 22 June 2014

Will Python suffer the same fate as Perl?

Not yet.

I think it turns out that Python 3 was a bad move strategically. But it's not the disaster that Perl 6 was, because it noticeably "exists", whereas Perl 6 was vapourware for a long time. And Python 2.7 and 3.x continue to develop similar libraries in parallel.

Worse still for Perl 6: its first implementation was written in Haskell, which got Perl programmers thinking about Haskell. After which there were fewer Perl programmers.

So I don't think that Python programmers are going to fall out through the gap between 2.x and 3.x.

Still, it's a regrettable confusion. I suspect Python will continue with people recognising that it comes in two different "dialects" much as people accepted that there were different dialects of BASIC. And eventually one will just quietly die.

See question on Quora

Posted on 14 June 2014

What does one mean by 'elegant' code?

It's very closely related to elegance in mathematics.

Elegant code is simple, gives you some new insight and is generally composable and modular. These qualities, although they may look almost arbitrary, are actually deeply related, practically different facets of the same underlying idea.


The biggest one, perhaps, is simplicity. But remember: simple is not the same thing as easy. Just because some code is very simple does not mean it is easy for you to understand. Easiness is relative; simplicity is absolute. 

This is especially relevant for Haskell: often, the most elegant Haskell code comes from simplifying a problem down to a well-known, universal abstraction, often borrowed from math. If you're not familiar with the abstraction, you might not understand the code. It might take a while to get it. But it is still simple.

Simple code like this is also often concise, but this is a matter of correlation, not causation. It goes in one direction: most elegant code is concise, but much concise code is not elegant.

One way of thinking about simplicity is that there are fewer "moving parts", fewer places to make mistakes. This is why many of Haskell's abstractions are so valuable—they restrict what you can possibly do, precluding common errors and shrinking the search space.

Consider the difference between mapping over a list and using a for-loop: with the loop, you could mess up the indexing, have an off-by-one error or even be doing something completely different like iterating over multiple lists at once or just repeating something n times. With a map, there's only one possible thing you can be doing: transforming a list. Much simpler! It leaves you with fewer places to make a mistake and code that's easier to read at a glance, since you immediately know the "shape" of the code when you see a map.
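The same contrast can be sketched in Python (the list and names here are made up for illustration). The index-based loop has bounds and indexing that could each be wrong; the map can only be a per-element transformation.

```python
prices = [3, 5, 8]

# Index-based loop: the range bounds and the indexing are both places to slip.
doubled_loop = []
for i in range(len(prices)):
    doubled_loop.append(prices[i] * 2)

# map: the only thing that can be happening is a per-element transformation.
doubled_map = list(map(lambda p: p * 2, prices))
```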

In fact, that's probably my favorite test for simplicity: given that I'm familiar with the relevant abstractions and idioms, how easy is the code to read at a glance? Code is read more often than it's written, but it's skimmed even more often than it's read. That makes the ability to quickly get the gist of an expression—without having to understand all the details—incredibly useful.


Another thing that elegant code does is give you a new insight on its domain.

Sometimes, this is a surprising connection between two things that seemed disparate. Sometimes it's a new way of thinking about the problem. Sometimes it's a neat idiom that captures a pattern that is normally awkward. Almost always, it's an idea that you can apply to other code or a common pattern you've already seen elsewhere.

Beyond the immediately practical reasons, mostly illustrated in the "simplicity" section, this is why I'm so drawn to elegant code:  it's the best way to learn new things. And these things, thanks to their simplicity and generality, tend to be pretty deep. Not just pointless details.

Elegant code also displays the essence of the problem it's solving. It's a clear reflection of the deeper structure underlying either the solution or the problem space, not just something that happened to work. If your problem has some sort of symmetry, for example, elegant code will somehow show or take advantage of it. This is why that QuickSort example (which, unfortunately, has some problems of its own) gets trotted out so often. It does a marvellous job of reflecting the structure, and especially the symmetry, of QuickSort, which the imperative version largely obscures in implementation detail. The key line
quicksort greater
reflects the shape of the resulting list.


The final characteristic of elegant code, especially elegant functional code, is composability and modularity. It does a great job of finding the natural stress lines in a problem and breaking it into multiple pieces. In some ways, this is just the same point all over: elegant code gets at the structure of what it's doing.

Really elegant code combines this with giving you a new insight and letting you split a problem into two parts that you thought inseparable. This is where laziness really shines, coincidentally.

A great such example is splitting certain algorithms into two phases: constructing a large data structure and then collapsing it. Just think of heapsort: build a heap then read elements out of it. That particular algorithm is elegant on its own, and is pretty easy to implement directly in two parts. For many other algorithms, the only way to separate them and maintain the same asymptotic bounds is to construct and fold the data structure lazily.
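As a small illustration of the heapsort case, here is a sketch using Python's standard heapq module. It is an eager version, so it shows only the two-phase split, not the lazy construction discussed above.

```python
import heapq


def heapsort(items):
    """Sort by building a heap (phase 1) and then draining it (phase 2)."""
    heap = list(items)
    heapq.heapify(heap)  # phase 1: construct the data structure
    # phase 2: collapse it by repeatedly popping the smallest element
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

The two phases are independent pieces: the builder knows nothing about how the heap will be consumed, and the consumer knows nothing about how it was built.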

Conal Elliott has a great talk about this which is well worth a look. It includes some specific examples of splitting up algorithms that seem inseparable into a fold and an unfold—most of which only work lazily.

I think modularity is one of the best ways to avoid bugs and, to illustrate, I'm just going to reuse the same pictures. The first represents code that's less modular; the second represents code that's more modular. You can see why I'd find the second one more elegant!

Imagine these graphs to be parts of your code with actual, or potential, interconnections between them. If all your code is in one big ball, then every part could potentially depend on every other part; if you manage to split it into two modules with clear module boundaries, the total number of possible interconnections goes way down.
Not very modular, pretty complex—not very elegant.

Simpler and more elegant.
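A rough way to put numbers on this, assuming every pair of parts inside a module may depend on each other and that two modules talk through a single interface (both are simplifying assumptions, not something the pictures specify):

```python
def possible_links(n):
    """Number of distinct pairs among n parts: n choose 2."""
    return n * (n - 1) // 2


# Ten parts in one big ball: every pair is a potential dependency.
one_ball = possible_links(10)

# The same ten parts split into two modules of five,
# plus one assumed link for the interface between the modules.
two_modules = 2 * possible_links(5) + 1
```

Under these assumptions the split version has fewer than half as many potential interconnections.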

An Example

But that was all pretty abstract. So let me give you an example that captures all of these ideas and neatly illustrates elegance.

Let's say we have a bunch of records containing book metadata:
data Book = Book { author, title :: String
                 , date :: Date
                 {- ... -}
                 }

We want to sort our book collection, first by author, then by title, then by date. Here's the really elegant way to do it:
sortBy (comparing author <> comparing title <> comparing date)

We can use comparing to turn each field into a comparison function of type Book -> Book -> Ordering and then use the monoid operator <> to combine these comparison functions.

It does exactly what you expect it to, but if you're not familiar with monoids and the Ordering type, you might not know why it does what you expect.

On the other hand, there is the really explicit version which replaces each <> with pattern-matching on Ordering. To somebody who's not familiar with the relevant abstractions, this might be easier to read, but it's also more complex and noisy. Less elegant.

This example is simple because it neatly abstracts over all the plumbing needed to combine the comparison functions. It's very easy to tell, at a glance, exactly which fields we're sorting by and with what priorities.

It's insightful because it takes advantage of the natural way to combine Ordering values: the way they form a monoid. Moreover, going from the Ordering monoid to the Book -> Book -> Ordering monoid is actually also free: if we know how to combine any type o, we know how to combine functions a -> o. So the abstraction that hid the plumbing? We got most of that for free, from libraries that are not specific to Ordering at all!

Finally, this version is definitely more modular and composable than the alternatives. It's very easy to mix and match different comparison functions with this pattern. We can trivially extract parts of them to be their own functions. It's very easy to refactor. All good things.
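For readers who don't know Haskell, the same idea can be approximated in Python. This is only a sketch: comparing and combine are hypothetical helpers written here to mimic the Haskell functions, and the sample records are made up.

```python
from collections import namedtuple
from functools import cmp_to_key

Book = namedtuple("Book", ["author", "title", "date"])


def comparing(field):
    """Build a comparator returning -1, 0 or 1 from a field name."""
    def compare(a, b):
        x, y = getattr(a, field), getattr(b, field)
        return (x > y) - (x < y)
    return compare


def combine(*comparators):
    """Mimic the Ordering monoid: the first non-tie result wins."""
    def compare(a, b):
        for comparator in comparators:
            result = comparator(a, b)
            if result != 0:
                return result
        return 0
    return compare


# Made-up sample records, for illustration only.
books = [
    Book("Baker", "Zebra", 2001),
    Book("Adams", "Yak", 1999),
    Book("Baker", "Aardvark", 2005),
    Book("Baker", "Aardvark", 2003),
]

by_author_title_date = combine(
    comparing("author"), comparing("title"), comparing("date")
)
sorted_books = sorted(books, key=cmp_to_key(by_author_title_date))
```

In everyday Python a tuple key such as operator.attrgetter("author", "title", "date") gives the same ordering with less code; the version above is spelled out only to show how the comparison functions compose.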

Hopefully that's a nice illustration of what people mean by elegant and why it comes up often in languages like Haskell.

See question on Quora

Posted on 25 May 2014

Do I count as spoiled if I'm starting to find Python ugly?

Not at all. Python is very popular, but it isn't a particularly well designed language. Honestly, it's woefully overrated. Depending on who's using it, it got popular because, for high-level tasks, it's better than Java or better than Perl or even just better than C++: not exactly a high bar to clear.

Jesse Tov wrote a great post about Python's design issues, which is well worth a read: Jesse Tov's answer to What are the main weaknesses of Python as a programming language?.

See question on Quora

Posted on 3 May 2014

How hard is it to learn Java if I already know how to program in Python?

Java, in my opinion, has a more explicit and easier-to-learn syntax. No slices. No implicit looping. Not many operators at all. Anonymous classes (similar to lambda functions) are maybe the most obscure syntactical feature, but it's not hard to figure those out. For the most part it's just scalars, objects, and methods.

I would definitely say it's easier to learn Java than Python.

See question on Quora

Posted on 16 April 2014

How do I become a data scientist?

Here are some amazing and completely free resources online that you can use to teach yourself data science.

Besides this page, I would highly recommend the Quora Data Science FAQ as your comprehensive guide to data science! It includes resources similar to this one, as well as advice on preparing for data science interviews.

Bookmark this page and the FAQ to refer back often!

Fulfill your prerequisites

Before you begin, you need Multivariable Calculus, Linear Algebra, and Python.

If your math background is up to multivariable calculus and linear algebra, you'll have enough background to understand almost all of the probability / statistics / machine learning for the job.

Multivariate Calculus:
Numerical Linear Algebra / Computational Linear Algebra / Matrix Algebra: Linear Algebra, Coursera (starts 2/2/2015)

Multivariate calculus is useful for some parts of machine learning and a lot of probability. Linear / Matrix algebra is absolutely necessary for a lot of concepts in machine learning.

You also need some programming background to begin, preferably in Python. Most other things on this guide can be learned on the job (like random forests, pandas, A/B testing), but you can't get away without knowing how to program!

Python is the most important language for a data scientist to learn. Check out
For some reasoning behind that.

To learn Python, check out How do I learn Python?
For general advice on learning how to program, check out How do I learn to code?

If you're currently in school, take statistics and computer science classes. Check out What classes should I take if I want to become a data scientist?

Plug Yourself Into the Community

Check out Meetup to find some groups that interest you! Attend an interesting talk, learn about data science live, and meet data scientists and other aspiring data scientists!

Start reading data science blogs and following influential data scientists!

Set up your tools

  • Install Python, IPython, and related libraries (guide)
  • Install R and RStudio (I would say that R is the second most important language. It's good to know both Python and R)
  • Install Sublime Text

Learn to use your tools

Learn Probability and Statistics

Be sure to go through a course that involves heavy application in R or Python.

Complete Harvard's Data Science Course

This course is developed in part by a fellow Quora user, Professor Joe Blitzstein. Note that I recommend completing the 2013 version of the class instead of the 2014 version.

Intro to the class

Lectures and Slides



Do most of Kaggle's Getting Started and Playground Competitions

I would NOT recommend doing any of the prize-money competitions. They usually have datasets that are too large, complicated, or annoying, and are not good for learning.

Start by learning scikit-learn, playing around, reading through tutorials and forums at Data Science London + Scikit-learn for a simple, synthetic, binary classification task.

Next, play around some more and check out the tutorials for Titanic: Machine Learning from Disaster with a slightly more complicated binary classification task (with categorical variables, missing values, etc.)

Afterwards, try some multi-class classification with Forest Cover Type Prediction.

Now, try a regression task Bike Sharing Demand that involves incorporating timestamps.

Try out some natural language processing with Sentiment Analysis on Movie Reviews

Finally, try out any of the other knowledge-based competitions that interest you!

Learn Some Data Science Electives

Do a Capstone Product / Side Project

Use your new data science and software engineering skills to build something that will make other people say wow! This can be a website, new way of looking at a dataset, cool visualization, or anything!

Code in Public

Create public GitHub repositories, make a blog, and post your work, side projects, Kaggle solutions, insights, and thoughts! This helps you gain visibility, build a portfolio for your resume, and connect with other people working on the same tasks.

Get a Data Science Internship or Job

Check out What is the Data Science topic FAQ? for more discussion on internships, jobs, and data science interview processes!

Book Recommendations

These three books are available as free pdfs at:

Check out more specific versions of this question:

Think like a Data Scientist

In addition to the concrete steps I listed above to develop the skillset of a data scientist, I include seven challenges below so you can learn to think like a data scientist and develop the right attitude to become one.

(1) Satiate your curiosity through data

As a data scientist you write your own questions and answers. Data scientists are naturally curious about the data that they're looking at, and are creative with ways to approach and solve whatever problem needs to be solved.

Much of data science is not the analysis itself, but discovering an interesting question and figuring out how to answer it.

Here are two great examples:

Challenge: Think of a problem or topic you're interested in and answer it with data!

(2) Read news with a skeptical eye

Much of the contribution of a data scientist (and why it's really hard to replace a data scientist with a machine) is that a data scientist will tell you what's important and what's spurious. This persistent skepticism is healthy in all sciences, and is especially necessary in a fast-paced environment where it's too easy to let a spurious result be misinterpreted.

You can adopt this mindset yourself by reading news with a critical eye. Many news articles have inherently flawed main premises. Try these two articles. Sample answers are available in the comments.

Easier: You Love Your iPhone. Literally.
Harder: Who predicted Russia’s military intervention?

Challenge: Do this every day when you encounter a news article. Comment on the article and point out the flaws.

(3) See data as a tool to improve consumer products

Visit a consumer internet product (preferably one that you know doesn't do extensive A/B testing already), and then think about its main funnel. Does it have a checkout funnel? Does it have a signup funnel? Does it have a virality mechanism? Does it have an engagement funnel?

Go through the funnel multiple times and hypothesize about different ways it could do better to increase a core metric (conversion rate, shares, signups, etc.). Design an experiment to verify if your suggested change can actually change the core metric.
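Once such an experiment has run, one common way to check whether the change moved a conversion rate is a two-proportion z-test. The sketch below uses only the standard library; the function name and the visitor and conversion counts are hypothetical, and a real analysis would also need to decide sample size and stopping rules in advance.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF written in terms of the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


# Hypothetical counts: control converts 200 of 5000, variant 250 of 5000.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
```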

Challenge: Share it with the feedback email for the consumer internet site!

(4) Think like a Bayesian

To think like a Bayesian, avoid the Base rate fallacy. This means to form new beliefs you must incorporate both newly observed information AND prior information formed through intuition and experience.

Checking your dashboard, user engagement numbers are significantly down today. Which of the following is most likely?

1. Users are suddenly less engaged
2. Feature of site broke
3. Logging feature broke

Even though explanation #1 completely explains the drop, #2 and #3 should be more likely because they have a much higher prior probability.
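The reasoning can be written out with Bayes' rule. All of the numbers below are hypothetical priors and likelihoods picked for illustration, not measurements: each posterior is proportional to the prior times the probability of seeing the drop under that explanation.

```python
# Hypothetical prior probability of each explanation on any given day.
priors = {
    "users less engaged": 0.02,
    "site feature broke": 0.38,
    "logging broke": 0.60,
}

# Hypothetical probability of seeing a sharp one-day drop under each one.
likelihoods = {
    "users less engaged": 1.0,
    "site feature broke": 0.8,
    "logging broke": 0.8,
}

# Bayes' rule: posterior is prior times likelihood, normalized to sum to 1.
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: value / total for h, value in unnormalized.items()}
```

Even with a likelihood of 1.0, the first explanation stays the least probable because its prior is so small.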

You're in senior management at Tesla, and five of Tesla's Model S's have caught fire in the last five months. Which is more likely?

1. Manufacturing quality has decreased and Teslas should now be deemed unsafe.
2. Safety has not changed and fires in Tesla Model S's are still much rarer than their counterparts in gasoline cars.

While #1 is an easy explanation (and great for media coverage), your prior should be strong on #2 because of your regular quality testing. However, you should still be seeking information that can update your beliefs on #1 versus #2 (and still find ways to improve safety). Question for thought: what information should you seek?

Challenge: Identify the last time you committed the Base rate fallacy. Avoid committing the fallacy from now on.

(5) Know the limitations of your tools

“Knowledge is knowing that a tomato is a fruit, wisdom is not putting it in a fruit salad.” - Miles Kington

Knowledge is knowing how to perform an ordinary linear regression, wisdom is realizing how rarely it applies cleanly in practice.

Knowledge is knowing five different variations of K-means clustering, wisdom is realizing how rarely actual data can be cleanly clustered, and how poorly K-means clustering can work with too many features.

Knowledge is knowing a vast range of sophisticated techniques, but wisdom is being able to choose the one that will provide the most amount of impact for the company in a reasonable amount of time.

You may develop a vast range of tools while you go through your Coursera or EdX courses, but your toolbox is not useful until you know which tools to use.

Challenge: Apply several tools to a real dataset and discover the tradeoffs and limitations of each tool. Which tools worked best, and can you figure out why?

(6) Teach a complicated concept

How does Richard Feynman distinguish which concepts he understands and which concepts he doesn't?

Feynman was a truly great teacher. He prided himself on being able to devise ways to explain even the most profound ideas to beginning students. Once, I said to him, "Dick, explain to me, so that I can understand it, why spin one-half particles obey Fermi-Dirac statistics." Sizing up his audience perfectly, Feynman said, "I'll prepare a freshman lecture on it." But he came back a few days later to say, "I couldn't do it. I couldn't reduce it to the freshman level. That means we don't really understand it." - David L. Goodstein, Feynman's Lost Lecture: The Motion of Planets Around the Sun

What distinguished Richard Feynman was his ability to distill complex concepts into comprehensible ideas. Similarly, what distinguishes top data scientists is their ability to cogently share their ideas and explain their analyses.

Check out Edwin Chen's answers to these questions for examples of cogently-explained technical concepts:

Challenge: Teach a technical concept to a friend or on a public forum, like Quora or YouTube.

(7) Convince others about what's important

Perhaps even more important than a data scientist's ability to explain their analysis is their ability to communicate the value and potential impact of the actionable insights.

Certain tasks of data science will be commoditized as data science tools become better and better. New tools will make obsolete certain tasks such as writing dashboards, unnecessary data wrangling, and even specific kinds of predictive modeling.

However, the need for a data scientist to extract out and communicate what's important will never be made obsolete. With increasing amounts of data and potential insights, companies will always need data scientists (or people in data science-like roles), to triage all that can be done and prioritize tasks based on impact.

The data scientist's role in the company is to serve as the ambassador between the data and the company. The success of a data scientist is measured by how well he/she can tell a story and make an impact. Every other skill is amplified by this ability.

Challenge: Tell a story with statistics. Communicate the important findings in a dataset. Make a convincing presentation that your audience cares about.

If you liked this answer, please consider:

  1. Clicking "Want Answers" to What is the Data Science topic FAQ? and this question to get notifications of updates!
  2. Following me (William Chen) and my Quora blog at Storytelling with Statistics to get notified when I post more content like this!
  3. Sharing this page with your friends and followers via facebook / twitter / linkedin / g+ etc.!

See question on Quora

Posted on 18 March 2014

What are some common mistakes that could slow down one's Python scripts?

Using loops instead of list comprehensions.

Quick example:
def forloop():
    L = []
    for i in xrange(100):
        L.append(i**2)  # build the same list of squares as listcomp()

def listcomp():
    L = [i**2 for i in xrange(100)]

if __name__ == '__main__':
    import timeit
    print 'for-loop =', timeit.timeit("forloop()", setup="from __main__ import forloop")
    print 'list comp =', timeit.timeit("listcomp()", setup="from __main__ import listcomp")

for-loop = 10.033878088
list comp = 6.61429381371
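The snippet above is Python 2 (xrange and the print statement). A rough Python 3 equivalent is sketched below; the number=100_000 repetition count is an arbitrary choice, and the timings will differ by machine and interpreter version, so none are shown.

```python
import timeit


def forloop():
    """Build the list of squares with an explicit loop and append."""
    squares = []
    for i in range(100):
        squares.append(i ** 2)
    return squares


def listcomp():
    """Build the same list with a list comprehension."""
    return [i ** 2 for i in range(100)]


if __name__ == "__main__":
    for name, fn in (("for-loop", forloop), ("list comp", listcomp)):
        # timeit accepts a callable directly, so no setup string is needed.
        print(name, "=", timeit.timeit(fn, number=100_000))
```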

See comments for details on e.g.

See question on Quora

Posted on 27 February 2014

Can a high-level language like Python be compiled thereby making it as fast as C?

Yes it can. In fact, many high-level languages are compiled like that including Common Lisp, Scheme, OCaml and Haskell.

But you have to keep something in mind: C is not all that fast. Rather, C is easy to optimize.

This is an important difference: if you just write naïve C code, it won't be fast. It won't be terribly slow--certainly not as slow as Python--but it won't be anywhere close to the speed of optimized C.

C doesn't magically make your code fast. Rather, C exposes enough low-level details to make optimizing possible. It takes an expert in performance--one who is constantly thinking about cache behavior, register blocking, memory layout and so on--to write truly fast C code. And C doesn't even help all that much; it just makes all this possible in the first place.

For example, you could just compile your high-level program to C directly. But just because you're outputting C does not mean you're anywhere near the speed C can offer. And, in fact, this is exactly what happens with compilers like CHICKEN Scheme: they turn high-level code into C, but the result isn't nearly as good as handwritten C can be.

To actually rival C, your compiler would have to not just compile down to assembly but also optimize really cleverly. You would have to compete with both the optimizations C compilers already perform and the hand-optimization of experts. And, right now, we don't have any systems that can really do this in the general case.

There have been research projects like Stalin Scheme (it brutally optimizes) which could beat even hand-written C in some cases. But this comes with significant compiler complexity, really long compile times and prevents separate compilation--enough problems to basically kill Stalin. There have also been projects that can generate really fast code for specific tasks or really short programs. But nothing general.

So: yes, you can compile high-level languages. And, in one sense, they would be as fast as C. But they will still not be as optimizable as C, so hand-written C will still trounce your high-level programs.

See question on Quora

Posted on 2 January 2014

How do I become a data scientist?

Become a Data Scientist by Doing Data Science

The best way to become a data scientist is to learn - and do - data science. There are many excellent courses and tools available online that can help you get there.

Here is an incredible list of resources compiled by Jonathan Dinu, Co-founder of Zipfian Academy, which trains data scientists and data engineers in San Francisco via immersive programs, fellowships, and workshops.

EDIT: I've had several requests for a permalink to this answer. See here: A Practical Intro to Data Science from Zipfian Academy

EDIT2: See also: "How to Become a Data Scientist" on SlideShare:

Python is a great programming language of choice for aspiring data scientists due to its general purpose applicability, a gentle (or firm) learning curve, and — perhaps the most compelling reason — the rich ecosystem of resources and libraries actively used by the scientific community.

When learning a new language in a new domain, it helps immensely to have an interactive environment to explore and to receive immediate feedback. IPython provides an interactive REPL which also allows you to integrate a wide variety of frameworks (including R) into your Python programs.

Data scientists are better at software engineering than statisticians and better at statistics than any software engineer. As such, statistical inference underpins much of the theory behind data analysis and a solid foundation of statistical methods and probability serves as a stepping stone into the world of data science.

edX: Introduction to Statistics: Descriptive Statistics: A basic introductory statistics course.

Coursera Statistics, Making Sense of Data: An applied statistics course that teaches the complete pipeline of statistical analysis.

MIT: Statistical Thinking and Data Analysis: Introduction to probability, sampling, regression, common distributions, and inference.

While R is the de facto standard for performing statistical analysis, it has quite a steep learning curve and there are other areas of data science for which it is not well suited. To avoid learning a new language for a specific problem domain, we recommend trying to perform the exercises of these courses with Python and its numerous statistical libraries. You will find that much of the functionality of R can be replicated with NumPy, SciPy, Matplotlib, and pandas (the Python Data Analysis Library).

Well-written books can be a great reference (and supplement) to these courses, and also provide a more independent learning experience. These may be useful if you already have some knowledge of the subject or just need to fill in some gaps in your understanding:

O'Reilly Think Stats: An Introduction to Probability and Statistics for Python programmers

Introduction to Probability: Textbook for Berkeley’s Stats 134 class, an introductory treatment of probability with complementary exercises.

Berkeley Lecture Notes, Introduction to Probability: Compiled lecture notes of above textbook, complete with exercises.

OpenIntro: Statistics: Introductory text book with supplementary exercises and labs in an online portal.

Think Bayes: A simple introduction to Bayesian Statistics with Python code examples.

A solid base of Computer Science and algorithms is essential for an aspiring data scientist. Luckily there are a wealth of great resources online, and machine learning is one of the more lucrative (and advanced) skills of a data scientist.

Coursera Machine Learning: Stanford’s famous machine learning course taught by Andrew Ng.

Coursera: Computational Methods for Data Analysis: Statistical methods and data analysis applied to physical, engineering, and biological sciences.

MIT Data Mining: An introduction to the techniques of data mining and how to apply ML algorithms to garner insights.

edX: Introduction to Artificial Intelligence: The first half of Berkeley’s popular AI course that teaches you to build autonomous agents to efficiently make decisions in stochastic and adversarial settings.

Introduction to Computer Science and Programming: MIT’s introductory course to the theory and application of Computer Science.

UCI: A First Encounter with Machine Learning: An introduction to machine learning concepts focusing on the intuition and explanation behind why they work.

A Programmer's Guide to Data Mining: A web based book complete with code samples (in Python) and exercises.

Data Structures and Algorithms with Object-Oriented Design Patterns in Python: An introduction to computer science with code examples in Python — covers algorithm analysis, data structures, sorting algorithms, and object oriented design.

An Introduction to Data Mining: An interactive Decision Tree guide (with hyperlinked lectures) to learning data mining and ML.

Elements of Statistical Learning: One of the most comprehensive treatments of data mining and ML, often used as a university textbook.

Stanford: An Introduction to Information Retrieval: Textbook from a Stanford course on NLP and information retrieval with sections on text classification, clustering, indexing, and web crawling.

One of the most under-appreciated aspects of data science is the cleaning and munging of data that often represents the most significant time sink during analysis. While there is never a silver bullet for such a problem, knowing the right tools, techniques, and approaches can help minimize time spent wrangling data.

School of Data: A Gentle Introduction to Cleaning Data: A hands on approach to learning to clean data, with plenty of exercises and web resources.

Predictive Analytics: Data Preparation: An introduction to the concepts and techniques of sampling data, accounting for erroneous values, and manipulating the data to transform it into acceptable formats.

OpenRefine (formerly Google Refine): A powerful tool for working with messy data, cleaning, transforming, extending it with web services, and linking to databases. Think Excel on steroids.

Data Wrangler: Stanford research project that provides an interactive tool for data cleaning and transformation.

sed - an Introduction and Tutorial: “The ultimate stream editor,” used to process files with regular expressions often used for substitution.

awk - An Introduction and Tutorial: “Another cornerstone of UNIX shell programming” — used for processing rows and columns of information.

The most insightful data analysis is useless unless you can effectively communicate your results. The art of visualization has a long history, and while being one of the most qualitative aspects of data science its methods and tools are well documented.

UC Berkeley Visualization: Graduate class on the techniques and algorithms for creating effective visualizations.

Rice University Data Visualization: A treatment of data visualization and how to meaningfully present information from the perspective of Statistics.

Harvard University Introduction to Computing, Modeling, and Visualization: Connects the concepts of computing with data to the process of interactively visualizing results.

Tufte: The Visual Display of Quantitative Information: Not freely available, but perhaps the most influential text for the subject of data visualization. A classic that defined the field.

School of Data: From Data to Diagrams: A gentle introduction to plotting and charting data, with exercises.

Predictive Analytics: Overview and Data Visualization: An introduction to the process of predictive modeling, and a treatment of the visualization of its results.

D3.js: Data-Driven Documents — Declarative manipulation of DOM elements with data dependent functions (with Python port).

Vega: A visualization grammar built on top of D3 for declarative visualizations in JSON. Released by the dream team at Trifacta, it provides a higher level abstraction than D3 for creating canvas or SVG based graphics.

Rickshaw: A charting library built on top of D3 with a focus on interactive time series graphs.

Modest Maps: A lightweight library with a simple interface for working with maps in the browser (with ports to multiple languages).

Chart.js: Very simple (only six charts) HTML5 canvas based plotting library with beautiful styling and animation.

When you start operating with data at the scale of the web (or greater), the fundamental approach and process of analysis must change. To combat the ever increasing amount of data, Google developed the MapReduce paradigm. This programming model has become the de facto standard for large scale batch processing since the release in 2007 of Apache Hadoop, the open-source MapReduce framework.
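To give a feel for the programming model, here is a single-machine sketch of a word count in the MapReduce style. It is plain Python, not the Hadoop API, and the function names and sample documents are made up; a real framework would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


# Made-up input documents, for illustration only.
documents = ["big data big ideas", "data at scale"]

pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be distributed without the pieces needing to coordinate.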

UC Berkeley: Analyzing Big Data with Twitter: A course — taught in close collaboration with Twitter — that focuses on the tools and algorithms for data analysis as applied to Twitter microblog data (with project based curriculum).

Coursera: Web Intelligence and Big Data: An introduction to dealing with large quantities of data from the web; how the tools and techniques for acquiring, manipulating, querying, and analyzing data change at scale.

CMU: Machine Learning with Large Datasets: A course on scaling machine learning algorithms on Hadoop to handle massive datasets.

U of Chicago: Large Scale Learning: A treatment of handling large datasets through dimensionality reduction, classification, feature parametrization, and efficient data structures.

UC Berkeley: Scalable Machine Learning: A broad introduction to the systems, algorithms, models, and optimizations necessary at scale.

Mining Massive Datasets: Stanford course resources on large scale machine learning and MapReduce with accompanying book.

Data-Intensive Text Processing with MapReduce: An introduction to algorithms for the indexing and processing of text that teaches you to “think in MapReduce.”

Hadoop: The Definitive Guide: The most thorough treatment of the Hadoop framework, a great tutorial and reference alike.

Programming Pig: An introduction to the Pig framework for programming data flows on Hadoop.

Data Science is an inherently multidisciplinary field that requires a myriad of skills to be a proficient practitioner. The necessary curriculum has not fit into traditional course offerings, but as awareness of the need for individuals who have such abilities is growing, we are seeing universities and private companies creating custom classes.

UC Berkeley: Introduction to Data Science: A course taught by Jeff Hammerbacher and Mike Franklin that highlights each of the varied skills that a Data Scientist must be proficient with.

How to Process, Analyze, and Visualize Data: A lab oriented course that teaches you the entire pipeline of data science; from acquiring datasets and analyzing them at scale to effectively visualizing the results.

Coursera: Introduction to Data Science: A tour of the basic techniques for Data Science including SQL and NoSQL databases, MapReduce on Hadoop, ML algorithms, and data visualization.

Columbia: Introduction to Data Science: A very comprehensive course that covers all aspects of data science, with a humanistic treatment of the field.

Columbia: Applied Data Science (with book): Another Columbia course — teaches applied software development fundamentals using real data, targeted towards people with mathematical backgrounds.

Coursera: Data Analysis (with notes and lectures): An applied statistics course that covers algorithms and techniques for analyzing data and interpreting the results to communicate your findings.

An Introduction to Data Science: The companion textbook to Syracuse University’s flagship course for their new Data Science program.

Kaggle: Getting Started With Python For Data Science: A guided tour of setting up a development environment, an introduction to making your first competition submission, and validating your results.

Data science is an infinitely complex field and this is just the beginning.

 If you want to get your hands dirty and gain experience working with these tools in a collaborative environment, check out our programs at

There's also a great SlideShare summarizing these skills: How to Become a Data Scientist

You're also invited to connect with us on Twitter @zipfianacademy and let us know if you want to learn more about any of these topics.

See question on Quora

Posted on 7 November 2013

What are the ways for parallelizing python and numpy codes?

NumPy by default provides some Python wrappers for underlying C libraries like BLAS and LAPACK (or ATLAS). If you want multithreading, I think you can build NumPy against different libraries (like MKL). This guy did some benchmarking:

Before you look at parallel implementations, you should look at some more optimization (in case you are not aware of it): PerformancePython and PerformanceTips.

What you might want are Python wrappers to a high-performance linear algebra package written in C or Fortran. This exists for distributed memory systems in packages like PETSc, where they have petsc4py. If you need eigensolvers, there is also SLEPc and slepc4py (both rely on mpi4py). It takes a bit of messing around to set them up though.

In addition to PETSc, there is also Trilinos and its corresponding Python wrappers, PyTrilinos: PyTrilinos - Home. If you are looking at GPUs, there is PyCUDA: Andreas Klöckner's web page and Welcome to PyCUDA’s documentation!

Here is a really good presentation of high-performance Python:

Unfortunately it is not as easy as it is in MATLAB, but fortunately if you put in some time, you will get massive speedups.

See question on Quora

Posted on 7 October 2013

I want to learn C or C++ programming language. I do not know anything about either, or programming, which should I learn first? Or are there other better alternatives like Java, or Python?

Gratuitous analogy time:
let's pretend you want to be a carpenter, and build a wooden house instead.

C is a hammer and some nails. You can build anything in principle, but as a beginner, it will take an awful lot of attempts to make anything bigger than a dog house.

C++ is a chainsaw and a nail-gun. Before making anything complicated, you must figure out how to make sure that you don't hurt yourself with the tools. While the tools are certainly powerful, my hat comes off to you if all your limbs are still intact when you can finally move into your new house.

Java is a corporate building contractor. You get to draw your own extravagantly detailed blueprints, but exactly how it translates into nails and boards is ultimately outside your control.

Python is a hired architect with a bunch of friends. You can tell it tersely what you want, and your house becomes a reality through the employment of a motley crew of freelancers who will have their own preferences of whether to use hammers or nail-guns. You can influence their work if you start asking questions, but you don't necessarily have to touch any of it.

C is good for programs that more-or-less directly access the operating system and hardware, C++ is good for larger applications that would be in C if it didn't take so long to write it all explicitly, Java is good for programs that should not be concerned with what type of computer they are running on, and Python is good for producing something that works very quickly, for further refinement if need be.

If I were teaching programming to beginners, I'd go with Python out of this lot, because although it does hide a bunch of things from the programmer, it also leaves the details as an option to dive in and figure out what it does exactly.

That's a good starting point for learning, I think.

See question on Quora

Posted on 13 August 2013

How do I become a data scientist?

There is a really comprehensive and cool visualization of the path to follow to become a data scientist.

The infographic shows the skills needed to become a good data scientist and maps out the learning path across 10 different domains.

Edit: The image came from the article, Becoming a Data Scientist - Curriculum via Metromap - Pragmatic Perspectives, by Swami Chandrasekaran.

See question on Quora

Posted on 11 August 2013

What are some cool Python tricks?

You can use a dictionary to store a switch, so that you can use it repeatedly and keep your code clean (it also prevents you having to have the overhead of defining the switch each time).

calculator = {
    'plus': lambda x, y: x + y,
    'minus': lambda x, y: x - y
}
calculator['plus'](4, 2)
>> 6

See question on Quora

Posted on 7 August 2013

Is Python the most important programming language to learn for aspiring data scientists & data miners?

For aspiring Data Scientists, Python is probably the most important language to learn because of its rich ecosystem.

Python's major advantage is its breadth. For example, R can run Machine Learning algorithms on a preprocessed dataset, but Python is much better at processing the data. Pandas is an incredibly useful library that can essentially do everything SQL does and more. matplotlib lets you create useful visualizations to quickly understand your data.

In terms of algorithm availability, you can get plenty of algorithms out of the box with scikit-learn. And if you want to customize every detail of your models, Python has Theano. In addition, Theano is easily configured to run on the GPU, which gives you a cheap and easy way to get much higher speeds without having to change a single line of code or delve into performance details.

I've used R, MATLAB, Octave, Python, SAS, and even Microsoft Analysis Services, and Python is the clear winner in my book.

See question on Quora

Posted on 26 April 2013

Which is better, PHP or Python? Why?

Python will make you a better programmer over time, because the language is consistent, borrows good ideas from functional programming, is clean and easy to read, has lots of clever and useful constructs (decorators, iterators, list comprehensions, ...), has first-class functions, comes fully loaded with any library you've ever dreamed of, and has a great community and clear, respected conventions and philosophy (look at PEP8), etc.

Try out Flask to start getting results right away: it's easy enough to get a website up and running in a matter of minutes. The development environment is very easy to set up, but a production environment may be slightly harder to get right, depending on your sysop skills. Basic or free hosting is on the PHP side, though.

Jython, a Python interpreter that runs on the JVM, is available and may be something to look at: I believe it's commonly used to offer scripting capabilities to Java applications.

The learning curve isn't as steep as it might seem: indentation matters, and there are no curly braces. Beyond that, you will feel right at home, but with more power and expressiveness than the half-baked PHP.

Just as a reminder: PHP, born Personal Home Page, was a set of simple macros written to basically get a dynamic visitor counter, and it grew from this legacy. The old days are way behind, but you can still feel the language's organic growth everywhere (inconsistencies, half-baked object model, half-baked closures (really just anonymous functions, which landed a few months (years?) ago), amateurish community, etc).
If you still want to or have to work with PHP, have a look at Fabien Potencier's work: he's the lead dev of successful frameworks and libs such as Symfony and Twig, and some other top-notch stuff.

Anyway, just go with python :)

See question on Quora

Posted on 10 April 2013

Which is better, PHP or Python? Why?

My vote is for Python.  Here are just a few reasons based on my experience with both...

- Arrays: Most languages make a distinction between arrays and hashes.  PHP uses the same data type for both, which forces a programmer to jump through a lot of hoops testing the sanity of their array structures.  Have a look at "array" in the PHP documentation.  There are way too many array functions.

- Objects: In Python everything is an object, so string methods are accessible through string objects, array (or list) methods are accessible through the list object, etc., e.g. Python: my_list.pop() VS. PHP: array_pop($i_hope_this_is_indeed_an_array)

- Errors/Typos: PHP is hard to debug because it lets nearly anything fly with only notices or warnings.  For example, you can spend countless hours adding print statements throughout your PHP code only to discover a simple typo in a variable name.  Python will stacktrace if you try to access an uninitialized variable, or accidentally add an integer and a string.

- Triple equal:  PHP WTF?  Need I say more?  :-)

- Conciseness: I personally feel like I can accomplish more with less code in Python, and it's easier to read.  It's not just the fact that Python doesn't require all the dollar signs, curly braces and other syntactic cruft, Python is simply more concise.

- Imports: IMHO, PHP's method of including code is confusing (include(), include_once(), require(), require_once()) and the new namespacing stuff is even worse.  Python uses import statements like Java... simple and clean.
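To make the Errors/Typos point above concrete, here is a small sketch of the two failures it mentions. The function `total_price` and the misspelled name `quantty` are made up for illustration. The snippet uses the `print()` function so it also runs on Python 3.

```python
def total_price(quantity, unit_price):
    # Deliberate typo: "quantty" was never defined, so Python raises
    # NameError when this line runs instead of silently using a blank value.
    return quantty * unit_price

errors = []

try:
    total_price(3, 2)
except NameError as exc:
    errors.append(type(exc).__name__)

try:
    1 + "1"  # adding an integer and a string raises TypeError
except TypeError as exc:
    errors.append(type(exc).__name__)

print(errors)
```

Without the `try`/`except` wrappers, each failure would stop the program with a traceback pointing at the offending line, which is what makes the typo quick to find.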

This list could go on forever.  ;)

See question on Quora

Posted on 9 April 2013

Which is better, PHP or Python? Why?

If you don't know either language and need to crank out a quick simple one-off web site where maintainability is much less important than time to first working version: PHP.

If raw performance of the runtime environment is a critical factor: Maybe PHP, thanks to the existence of Facebook's HipHop for PHP which as far as I know isn't matched in raw performance by anything on the Python side. (Someone please tell me if I'm wrong.) Of course, few web apps are going to be CPU-bound.

If neither of those two criteria apply: Python.

Hey, someone needed to take a stab at a contrary answer.

See question on Quora

Posted on 8 April 2013

Why is there a recent trend away from PHP towards Python and Ruby on Rails?

PHP was the first open source language designed for the web and reached maturity around 1999 with the release of PHP4.  Before PHP there was only Perl (free, but a general purpose scripting language, kind of hard to learn) and ASP (which was not free and required an enterprise-level budget to run).  So PHP had a head start of about 6 years over Ruby and Python.  (These languages had existed since the mid 1990s but had no web frameworks written for them.)

Despite PHP's many shortcomings (lack of true object orientation, weak exception handling, no lambdas, and as others have mentioned, being essentially a huge flat namespace of inconsistently-named functions) it won by its ubiquity.  It was free and even the cheapest commodity web hosting providers were offering PHP by 2002 or 2003, so it had a full generation in Internet years to establish itself as the common language for open source developers. 

The emergence of Rails in 2005 began to change that but it took a few years for Rails to gain mainstream acceptance.  Python followed suit with the development of the Django framework, on the same MVC pattern as Rails. 

Services like Heroku were essential in getting Ruby to the mainstream - you no longer needed to have dedicated servers or know how to compile source code to run a Ruby server - you essentially had the same consumer-level pricing for running Ruby apps that you had with PHP. 

Ruby and Python are overtaking PHP because developers tend to favor the languages - they have better abstractions and allow programmers to be more productive.  Also, the ubiquity of PHP worked against it a little, because it meant that less skilled programmers could contribute code, and the quality of code in PHP projects is generally much lower as a result (see WordPress plugins for example), while the Ruby and Python communities have focused on developing better coding practices like Test Driven Development.  As a result, people who use Ruby and Python are perceived as "better" programmers, and more desirable hires.  New technology-focused companies are thus more likely to start projects in Ruby and Python because of the perceived higher quality of developers, even though for most web applications, an experienced team ("experienced" being the key) using Symfony or Cake can be just as productive as a team using Rails or Django.

There's always going to be a fringe language X that's favored by hackers and academics, but has no obvious business application and thus stays obscure, only to seemingly come out of nowhere years after its invention when the critical mixture of a user need and practical libraries is achieved.  Today it might be Haskell or OCaml or Scala.  It's been LISP for about 50 years now.

See question on Quora

Posted on 20 February 2013

How do I learn Python?

The easiest way to learn a programming language is to first learn the basics and then try to build something with it (learn by doing). And it's better if you are building something you are actually interested in rather than something out of a book because it will get you to think about the problem and be more meaningful.

Python is easy to learn (not much syntax), easy to read (explicit vs implicit), has a big ecosystem (more packages/libraries), is taught at universities so it's easy to find good programmers to help, and is used by many large websites/companies (e.g., Quora is programmed in Python) so it's a good language to know.

Online Python Tutorials (in order from introductory to more advanced):

  1. "A Byte of Python"
  2. Google's Intro to Python Class (online)
  3. "Dive Into Python", by Mark Pilgrim
  4. "The New Boston" Programming Python Tutorials
  5. "Building Skills in Python", by Steven F. Lott
  6. "Think Python: How to Think Like a Computer Scientist"
  7. "Code Like a Pythonista: Idiomatic Python"
  8. OpenCourseWare: MIT 6.00 Introduction to Computer Science and Programming
  9. MIT 6.01 Course Readings (PDF)
  10. Google's "Understanding Python" (more advanced talk)
  11. "A Guide to Python's Magic Methods"
  12. "Metaclasses Demystified"

Book to Get: "Python Cookbook", by Alex Martelli

And if you're building something Web based, look at using the Flask Web Framework.

Flask is a modern, lightweight, and well-documented Python Web framework so you won't have to spend much time learning it or fighting with it -- you won't find yourself asking, "Will I be able to do what I want in the framework without hacking it?" Flask lets you program in Python rather than writing to the framework like you typically have to in larger, opinionated frameworks like Django and Rails.

See question on Quora

Posted on 28 April 2012

Does being proficient in more than one or two programming language(s) benefit one in the long run? Why or why not?

Yes, being a "polyglot programmer" benefits one in the long run.

Other languages give you other perspectives on how to solve problems, even if you stick to one language and its platform as the backbone of your career.

For large multi-language projects, understanding other languages will improve how you work with your team, even if you don't code in the other languages yourself.

See question on Quora

Posted on 26 February 2012

Does being proficient in more than one or two programming language(s) benefit one in the long run? Why or why not?

If you are in it for the long run, you will definitely learn a proliferation of languages. It just happens. No particular point learning multiple languages, if you can identify which will get you your first job. Learn that language. Other languages will offer themselves to you, as you progress, after that.

What one learns from this, is that computers are extremely simple things, and computer languages actually equate to dialects of the same basic instructions. There's a local form of if, there's a local form of while, there's a local form of switch, and so on. Anyone who tells you different hasn't looked at enough assembler, to see the actual steps that each of these macro tasks hides from you.

The main thing is, never resent a language, or fight against it, out of principle. Every computer language (that has been used outside of academia) has had its moment in the sun. If you encounter old code, try to understand that it often does the things it does because once upon a time, say, a 50Meg hard drive was big, and 56Mbits a second was fast... And one day (if you are any good at your job) your code will be the dreadful creeping horror no one dares turn off.

See question on Quora

Posted on 26 February 2012

I want to learn to code Python and Django (web framework). What's the best way to start for a programming newbie?

Although I think Python is a better overall language, if you just want to slap a utilitarian web interface on some backend code for internal use then PHP might be a better language to learn. It's easier to set up on the server, will run on virtually any host, and is a more out of the box solution.

As for Python/Django:

If you have never programmed before, it's definitely worth learning Python before you get to Django. Someone with experience could skip to a Django book/tutorial and pick up Python on the way - it's a simple language with very clear, easy to read and understand code.

How long it takes you to learn what you need to know is highly variable. If you are just trying to write some automation scripts to help cut down some manual labor, then you can probably go from zero to this point in a few weeks (maybe 20-30 hours). If you want to write production quality web apps using Python/Django, it's going to take longer.

Setup The Environment

First download Python if you don't have it. I prefer Linux, but your MacBook will be more than sufficient as a dev machine.

Python is in a state of limbo between the 2.7 release version and 3. While 3 is the future, it introduces some intrinsic changes which many of the popular libraries do not yet support, Django included. Your best bet is to start with 2.7 and switch to Python 3 later. Also, most of the learning material available is still written for Python 2.

You can write code in any text editor. My favorite, and an up-and-coming basic code editor is Sublime Text. It is simple, elegant, and very functional. It costs $59, but you can use it free for an unlimited amount of time (as of right now). Well worth buying though.

Many Mac developers love and swear by TextMate. It's more developed and further along than Sublime, I think. Costs $54, and has a 30-day trial.

If you get deeper into programming and want a full featured integrated development environment (IDE), then PyCharm is top notch. It costs $99 and has a yearly renewal fee for updates, but is worth it. Something like this has a much steeper learning curve than Sublime Text or TextMate, but they can save you time and keystrokes in the long run.

I'm going to assume you are familiar with working in the terminal, since you have IT experience. If not, this might be a good starting point:

Django apps can be run entirely on your own dev machine, but if you want to put it on the web to be accessed by others on your team, or from other machines you will need a host. There are some good questions on Quora about hosts, but ensure you choose one that allows Python and SSH access. I recommend finding a cheap Virtual Private Server (VPS), although this might be too steep a learning curve for someone without experience. (You say you've done a lot in the IT field, so some of this might be too basic for you, sorry).

I recommend learning and using Source Control. This helps manage your code revisions, and is particularly useful if you have more than one person working on it. I personally use Mercurial, but Git is more popular. is a good intro guide for Mercurial. looks to be good for Git, but I haven't worked through it yet.

In addition to using Source Control, you'll need a source code repository (you'll learn what this means in one of those tutorials). GitHub is the most popular, with BitBucket coming in second. You can use Git on either, but GitHub does not support Mercurial. Also, BB has better options for free accounts - unlimited free repos, whereas GitHub limits you.

You might feel overwhelmed trying to learn how to program Python, learning Django, and trying to figure out source control and a myriad of tools all at once. In my opinion it's best to get down a version control workflow early on, rather than putting it off. You'll develop good habits early on that will help you down the stretch.

Where to Learn
There are a ton of resources for learning Python, and quite a few for Django. Be sure that whatever you choose, you go with resources that consistently use either Python 2 or 3. Also, stay away from small tutorials and stick with complete references. Learning from piecemeal tutorials will leave you with fragmented knowledge, and they are usually lower quality.

Here is a list of references taken from another Quora question. The key to learning how to program, in my opinion, is to practice a lot. So do the exercises these books contain, and do more programming on your own.

Online Tutorials & Ebooks
All free

Recommended: (A higher level look at programming with Python as the tool; highly recommended if you want to be a good programmer)


Sometimes having a physical book makes it easier for some people to learn. Many of the above ebooks are available in hard copy.

Dive Into Python
Think Python
Learn Python the Hard Way
A Byte of Python

How do I learn Python?

All of those are Python references. The online material available for Django is more sparse, but there are some good resources.

The Django Book is the starting point for most people:

There is, of course, the official tutorial, but I found the Django Book more useful. However, get very familiar with the Django docs. They are very good, and you will be spending a lot of time digging into them.

This is a highly recommended hardcopy book for learning, but I've not used it:

Prefer video? This series ought to be very good, though I have not tried it yet either. There is a $25/mo fee for their service.

Getting Assistance
Inevitably, when you are learning or attempting to build something, you're going to run into a brick wall at some point.

This is my workflow if I get stuck on a concept, or while programming:
Check the Documentation -> Check the Source Code -> Search Google -> Ask on StackOverflow

Asking is always a last resort, quite simply because figuring it out on my own gives more of a sense of pride and accomplishment, and I'm more likely to remember the solution.

Python Docs:
Django Docs:

See question on Quora

Posted on 15 December 2011

What are some cool Python tricks?

Create infinities
Infinity, and its brother minus infinity, come in handy once in a while.

my_inf = float('Inf')
99999999 > my_inf
-> False

my_neg_inf = float('-Inf')
my_neg_inf > -99999999
-> False
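One place where infinity earns its keep is as the starting value when searching for a minimum: every real number compares smaller, so no special case is needed for the first element. A minimal sketch, where the function `closest_store` and its data are hypothetical (written with `print()` so it also runs on Python 3):

```python
def closest_store(distances_km):
    """Return the name of the nearest store, or None for an empty dict."""
    best_name, best_distance = None, float('inf')
    for name, distance in distances_km.items():
        # Any finite distance beats the initial infinity.
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name

print(closest_store({'north': 12.5, 'east': 3.2, 'south': 7.0}))
```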

Intuitive comparisons
A great example of the simplicity of python syntax.

x = 2
3 > x == 1
-> False
1 < x < 3
-> True
10 < 10*x < 30 
-> True
10 < x**5 < 30 
-> False
100 < x*100 >= x**6 + 34 > x <= 2*x <5
-> True
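The results above follow from how Python reads a chain: `a < b < c` means `a < b and b < c`, with the middle operand evaluated only once. It does not compare the result of the first comparison with the next value. A small sketch of the difference; the helper `middle` is made up to count evaluations, and `print()` is used so the snippet also runs on Python 3:

```python
x = 2

# A chain is an implicit "and" of pairwise comparisons.
chained = 3 > x == 1
spelled_out = (3 > x) and (x == 1)

# It is not a left-to-right comparison of intermediate results:
left_to_right = (3 > x) == 1  # True == 1, which Python treats as equal

calls = []

def middle():
    """Record each call so we can count how often the operand is evaluated."""
    calls.append('called')
    return x

in_range = 1 < middle() < 3  # middle() runs once, not twice

print(chained, spelled_out, left_to_right, in_range, len(calls))
```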

Enumerate it
Ever wanted to find that damn index when you're inside a loop?

mylist = [4,2,42]
for i, value in enumerate(mylist):
    print '%d: %d' % (i, value)
-> 0: 4
-> 1: 2
-> 2: 42

Reverse it
This has grown to become a part of my morning ritual. Reverse. Anywhere. Anytime. All the time.

# Reverse the list itself:
mylist = [1,2,3]
mylist.reverse()
print mylist
-> [3, 2, 1]
# Iterate in reverse
for element in reversed([1,2,3]): print element
-> 3
-> 2
-> 1

Ultra compact list generating
Using nested list comprehensions you can save a great deal of typing, while having fun impressing the girls.

[(x**2, y**2, x**2+y**2) for x in range(1,5) for y in range(1,5) if x<=y and x%2==0]
-> [(4, 4, 8), (4, 9, 13), (4, 16, 20), (16, 16, 32)]

NB! Crazy nesting should be used with extreme caution.
Readability > Girls
-> True

Splat call
'*' is called the splat operator, and may make you smile. It automagically unpacks stuff in a function call.

def foo(a, b, c):
    print a, b, c

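mydict = {'a':1, 'b':2, 'c':3}
mylist = [10, 20, 30]

foo(*mydict)    # unpacks the keys (dict order is arbitrary in Python 2)
-> a b c
foo(**mydict)   # unpacks the items as keyword arguments
-> 1 2 3
foo(*mylist)
-> 10 20 30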

The cute empty string trick
By using two single quotes ('') and a dot (.), we have access to all the builtin string methods. This can come in handy, you see.
''.join(['I', 'Want', 'Just', 'One', 'String'])
-> 'IWantJustOneString'

The itertools module provides some useful and efficient functions for us. For example

from itertools import chain
''.join(('Hello ', 'Kiddo'))
-> 'Hello Kiddo'
''.join((x for x in chain('XCVGOHST', 'VSBNTSKFDA') if x == 'O' or x == 'K'))
-> 'OK'

When Las Vegas just isn't enough
Buy in some beer, invite a few (exactly 10) friends over, and copy/paste this sexy line of python code into your favorite interpreter. The rules are:
1. Press Enter
2. The one who gets the fewest stars has to CHUG CHUG CHUG!
3. Press the up arrow
4. Goto 1.

from random import randint
print "\n".join(str(i)+":\t"+"*"*randint(1,10) for i in range(1,11))


Make python enums
I like this enumification trick:
class PlayerRanking:
  Bolt, Green, Johnson, Mom = range(4)

PlayerRanking.Mom
-> 3

See question on Quora

Posted on 2 December 2011

What are some cool Python tricks?

List comprehensions and generator expressions

Instead of building a list with a loop:
b = []
for x in a:
    b.append(10 * x)
foo(b)

you can often build it much more concisely with a list comprehension:
foo([10 * x for x in a])

or, if foo accepts an arbitrary iterable (which it usually will), a generator expression:
foo(10 * x for x in a)
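The practical difference between the two forms is that a list comprehension builds the whole list up front, while a generator expression produces values one at a time and can only be consumed once. A minimal sketch, where `foo` is a stand-in (here it simply sums its input) and not a real library function:

```python
def foo(values):
    """Hypothetical consumer: any function that accepts an iterable."""
    return sum(values)

a = [1, 2, 3]

from_list = foo([10 * x for x in a])  # builds [10, 20, 30], then sums it
from_gen = foo(10 * x for x in a)     # sums values as they are produced

gen = (10 * x for x in a)
first = next(gen)     # pulls only the first value
rest = list(gen)      # consumes whatever is left
leftover = list(gen)  # a generator is exhausted after one pass

print(from_list, from_gen, first, rest, leftover)
```

For a short list the two are interchangeable; the generator form mainly pays off when the input is large or the consumer may stop early.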

Python 2.7 supports dict and set comprehensions, too:
>>> {x: 10 * x for x in range(5)}
{0: 0, 1: 10, 2: 20, 3: 30, 4: 40}
>>> {10 * x for x in range(5)}
set([0, 40, 10, 20, 30])

Fun tricks with zip

Transposing a matrix:
>>> l = [[1, 2, 3], [4, 5, 6]]
>>> zip(*l)
[(1, 4), (2, 5), (3, 6)]

Dividing a list into groups of n:
>>> l = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
>>> zip(*[iter(l)] * 3)
[(3, 1, 4), (1, 5, 9), (2, 6, 5), (3, 5, 8)]
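The grouping trick works because `[iter(l)] * 3` is a list holding the same iterator three times, so each tuple that zip builds pulls three consecutive items from it. A sketch that spells this out; it wraps zip in `list()` so it also runs on Python 3, where zip returns an iterator:

```python
l = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]

# The same iterator object passed three times: zip advances it
# three steps for every tuple it produces.
it = iter(l)
spelled_out = list(zip(it, it, it))

compact = list(zip(*[iter(l)] * 3))

# Caveat: zip stops at the first exhausted argument, so a trailing
# partial group is silently dropped.
partial = list(zip(*[iter([1, 2, 3, 4, 5])] * 3))

print(spelled_out == compact, partial)
```

If the leftover items matter, `itertools.zip_longest` (named `izip_longest` in Python 2) pads the last group instead of dropping it.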


>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

See question on Quora

Posted on 1 December 2011

What can be done with Ruby or Python that just can't be done with PHP?

There's very little that cannot be done with PHP -- in fact it's not really an interesting question.

The more interesting question asks what things can't be done in PHP elegantly? Or simply? Or safely? Or quickly? Or shouldn't have to be done at all? These types of questions form much more useful metrics by which to compare general purpose programming languages.

A good example of that last category btw is PHP's maddening tendency to output errors to the socket it's reading from. I realize that PHP was originally a templating language and that this may have made sense at one time, but having to prefix calls with @ to suppress otherwise important error messages because they go to stdout warps the mind just a bit.

See question on Quora

Posted on 25 June 2011

What are good books of advanced topics in Python?

If you haven't yet, I highly recommend reading "Think Python: How to Think Like a Computer Scientist", available free in PDF format. It is also available at Amazon as a physical book. It is different from most programming books I have read in that it focuses less on teaching a language and more on how to be a good programmer. You will learn from it, even if you're already pretty slick with Python.

You might also take a look at Google's Python Class. It is geared toward the beginner Python coder, but might have some good lessons for you still. Most of it is taught through a series of videos, covering strings, lists and sorting, dictionaries and files, regular expressions, utilities, and urllib. At the very least, take a look through some of the exercises, as they are also the kind that make you think and do an excellent job at teaching good programming habits.

(Disclaimer: I haven't watched all of those videos yet, but everything I've seen so far is excellent.)

I forgot to mention: If you are interested in some non-book high quality learning material, MIT has an open courseware series covering Computer Science and Programming. It starts out pretty basic, but some of the advanced topics include testing and debugging, object oriented programming, encapsulation and inheritance, and then some math oriented topics like a stock market simulation, normal and exponential distributions, and linear regression.

Some of the skills it teaches are things you won't find in an ordinary programming book.

Finally (I swear), Natural Language Processing With Python might be of interest to you.

See question on Quora

Posted on 19 January 2011

What are the benefits of developing in node.js versus Python?

I see three major "wins" that Node.js has over most other development environments (including Python):

  1. It's built to handle asynchronous I/O from the ground up. Other environments have async. I/O features, but Node's the first environment where it's really pervasive. In most environments you'll find only limited pieces available in async. flavors, but in Node everything (or nearly everything) is async.-only. It's actually hard to write non-async. code in Node!

    Now, there's some debate over whether async. programming is really the silver bullet some claim it is, but in my mind there's little doubt that it's a really good match to a lot of common web- and network-development problems.
  2. It's "just JavaScript." Every time I  context switch between Python on the backend and JavaScript on the frontend I waste stupid amounts of time making silly syntax errors — semicolons in my Python, missing braces in my JavaScript, etc. Some days I might switch a dozen or more times, and it really feels like I'm wasting brain cycles swapping in and out my language knowledge. Staying in a single language feels faster.
  3. It's new, so it has the benefit of being able to learn from previous languages' and environments' mistakes. Better, Node can correct those mistakes without the backwards-compatibility concerns. For example, the Node package installer, npm, is already quite a bit better than many of its equivalents. All in all, Node feels very polished and modern; it hasn't had time to accumulate the cruft other languages/environments have.

    (Sadly, I'm all too sure it will accumulate this cruft eventually. Everything new feels old eventually.)

I highly recommend that you take the time to learn Node — it'll make you a better developer, whatever you end up using. I learned Node last year, and I'm very happy I did. It's a cool piece of software, and it's a great tool to have in your toolbox.

See question on Quora

Posted on 6 January 2011

What are the advantages of Python over Ruby?

The canonical response is to post the Zen of Python:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


From my personal experience:
  • Better support for scientific libraries I use in my day-to-day work: Ruby has no scipy/numpy, let alone the countless smaller modules.
  • Better readability and maintainability: this is very subjective, but I much prefer Python's syntax having used both.

These questions always come down to: Give both a good try, and pick whichever you like best. You can do pretty much everything in either language.

See question on Quora

Posted on 21 July 2010

How did Quora delineate their front-end development in terms of architecture/dev roles when it first started?

Our Python & JS code is basically in two buckets: core abstractions we created in-house and the implementation of those abstractions. That implementation aligns roughly to a traditional MVC model. So, for example, to create a typical part of the site there will be someone who is responsible for setting up the model and controllers and someone who will implement that in the front-end. Sometimes that is the same person, sometimes not.

We all pitch in across the MVC stack but I don't create or significantly modify the core abstractions and Adam doesn't typically add things to the UI. If you think of the back-end abstractions running the site like Livenode and Webnode2, Adam, Charlie and Kevin will create and extend those abstractions and I'll implement them in Python, JS and CSS to create the product. But again these aren't exact roles and I've written a few model and controller functions and Charlie and Kevin have implemented some products.

We each have our areas of expertise for sure. So, for example, I'm the only one who really ever touches CSS and Kevin, Charlie and Adam really only deal with servers, caching and data structures. Charlie also has domain expertise in JS and is the main person who creates and extends our intense JS abstractions.

The breakdown was loosely:
  • Back-end (Adam D'Angelo, Charlie Cheever, Kevin Der) -- Systems Management, Core Abstractions, Python, JavaScript
  • Front-end (Charlie Cheever, Rebekah Cox) -- Python, JavaScript, CSS

Since we've grown, the breakdown has become:
  • Back-end (Engineers) -- Systems Management, Core Abstractions, Python, JavaScript
  • Front-end (Designers, plus some Engineers) -- Python, JavaScript, CSS

See question on Quora

Posted on 11 February 2010

What are common uses of Python decorators?

Decorators are convenient for factoring out common prologue, epilogue, and/or exception-handling code in similar functions (much like context managers and the "with" statement), such as:
  • Acquiring and releasing locks (e.g. a "@with_lock(x)" decorator)
  • Entering a database transaction (and committing if successful, or rolling back upon encountering an unhandled exception)
  • Asserting pre- or post-conditions (e.g. "@returns(int)")
  • Parsing arguments or enforcing authentication (especially in web application servers like Pylons where there's a global request and/or cookies object that might accompany formal parameters to a function)
  • Instrumentation, timing or logging, e.g. tracing every time a function runs
They are also used as shorthand to define class methods (@classmethod) and static methods (@staticmethod) in Python classes.
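
To make two of the bullets above concrete, here is a minimal sketch of a lock-holding decorator and a timing decorator. The names `with_lock`, `timed`, `increment` and `counter_lock` are illustrative assumptions, not taken from any particular library.

```python
import functools
import threading
import time


def with_lock(lock):
    """Decorator factory: run the wrapped function while holding `lock`."""
    def decorator(func):
        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            with lock:  # acquired before the call, released afterwards
                return func(*args, **kwargs)
        return wrapper
    return decorator


def timed(func):
    """Store the duration of the most recent call in `wrapper.last_elapsed`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            wrapper.last_elapsed = time.perf_counter() - start
    wrapper.last_elapsed = None
    return wrapper


counter_lock = threading.Lock()  # hypothetical shared lock
counter = 0


@timed
@with_lock(counter_lock)
def increment(amount=1):
    """Add `amount` to the shared counter and return the new value."""
    global counter
    counter += amount
    return counter
```

Decorators apply bottom-up, so here the lock wrapper sits closest to the function and the timer measures the call including the time spent waiting for the lock.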

See question on Quora

Posted on 19 January 2010

Why is the programming language Python called Python?

"At the time when he began implementing Python, Guido van Rossum was also reading the published scripts from "Monty Python's Flying Circus" (a BBC comedy series from the seventies, in the unlikely case you didn't know). It occurred to him that he needed a name that was short, unique, and slightly mysterious, so he decided to call the language Python."

From the Python FAQ

See question on Quora

Posted on 18 December 2009

search results

When do you *NOT* use python?

Hi everyone,

We're all python fans here, and to be fair I may use it a bit more than I should. I'd like to hear other people's thoughts on which tasks they want to solve in a non-python language and which one they'd choose for that job.

Thanks in advance...

submitted by RealityShowAddict to Python
[link] [415 comments]

Posted on 16 February 2015

should stop steering web visitors away from v3 docs

$ curl --head
HTTP/1.1 301 Moved Permanently
Date: Fri, 06 Feb 2015 23:22:21 GMT
Server: nginx
Content-Type: text/html
Location:

This is another contributing factor to why v3 adoption is slow, and new users are confused. This configuration affects everything from StackOverflow links (how I first noticed it) to Google pagerank.

It's why Python3 docs don't often show up in search results. should default to v3. Or, at the very least, display a disambiguation page, a la Wikipedia.

submitted by caninestrychnine to Python
[link] [69 comments]

Posted on 6 February 2015

What do you *not* like using Python for?

Maybe sounds like a silly question, but here's the context: Been programming for ~10 years, professionally for the past 7. Matlab, C#, C++ (in decreasing order of proficiency). Per management, it looks like I'll now be getting into some Python for an upcoming project... which is cool, as with how prevalent Python seems to be, I've wanted to get my feet wet for a while.

Obviously all languages have their bounds... or at least things they do better than others. So - as I'm getting my feet wet here, does anything stand out as far as areas where Python is weak and there may be better alternatives?

submitted by therealjerseytom to Python
[link] [355 comments]

Posted on 18 October 2014

Python subreddit has largest subscriber base of any programming language subreddit (by far).

Python 80,220 (learnpython 26,519)
Javascript 51,971
Java 33,445
PHP 31,699
AndroidDev 29,483
Ruby 24,433
C++ 22,920
Haskell 17,372
C# 14,983
iOS 13,823
C 11,602
Go 10,661
.NET 9,141
Lisp 8,996
Perl 8,596
Clojure 6,748
Scala 6,602
Swift 6,394
Rust 5,688
Erlang 3,793
Objective-C 3,669
Scheme 3,123
Lua 3,100
"Programming" 552,126
"Learn Programming" 155,185
"CompSci" 73,677
submitted by RaymondWies to Python
[link] [118 comments]

Posted on 21 September 2014

What are the top 10 built-in Python modules that a new Python programmer needs to know in detail?

I'm fairly new to Python but not to Programming. With the programming languages that I've learned in the past I always see a recurring pattern — some libraries (modules) are more often used than others.

It's like the Pareto Principle (80/20 rule), which states that 80% of the outputs (or source code) will come from 20% of the inputs (language constructs/libraries).

That being said, I would like to ask the skilled Python veterans here what they think are the top 10 most used built-in modules in a typical Python program, which a beginner Python programmer like me would benefit from knowing in detail?


Thanks to all that have replied :)

I found a site where I can study most of the modules that you suggested:

(Python Module of the Week)



Of course, there is no substitute for the official documentation when it comes to detailed information:

Python 2.7.*:

Python 3.4.*:

submitted by ribbon_tornado to Python
[link] [135 comments]

Posted on 24 June 2014

I made my first ever thing in Python, and am really proud of it.

I'm not sure if this is more appropriately posted to /r/learnpython, but I'm new here and new at programming and I just really felt like sharing my excitement with someone! I've been interested in computers since almost forever, and have more recently been trying to actually learn some coding (a little html and python, just the basics). After figuring out how to use Python a bit I got Portable Python and sat myself down with the project of creating a 2 player game of Tic Tac Toe, and I did.

I've never really done anything on this level with a computer before and I just feel like I've opened up a door to a whole new world! I feel powerful for what I've done and I can't wait to do more.

Here is my little program if you guys are interested in seeing what an awful newb's poorly documented code looks like.

And happy coding to all! :D

submitted by DiscyD3rp to Python
[link] [127 comments]

Posted on 13 May 2014

Learning python earned me a 50% raise, and it's about to again.

(Sorry for the throwaway, but I wanted to be able to answer questions honestly without any hesitation.)

I've been in IT since I was 17 in 1999. I started off at a help desk, and worked my way up to a Systems Administrator where I was making 60k USD/yr. (I currently have only an associates degree with no plans to go back to school.) I was primarily a Windows domain/ network admin, with a few *nix boxes spread throughout. I had known windows batch scripting, and way back in the day had programmed in BASIC before the world was.

I had tossed around the idea of learning a programming language before, but when asked I'd often say "Developers' brains just work differently than mine. I'm not a coder." Programming seemed so abstract and I couldn't really wrap my head around it. I finally decided though, to try something.

It was 2010 and I had heard a lot about Ruby on Rails and thought that Ruby would be a great language to learn. I ran through the tutorial of making a polls app at least 5 times, but I just couldn't wrap my head around it. So I gave up.

One year later I heard about python. Despite all the negative talk about python while googling for "python vs ruby vs php vs ..." (GIL, speed, whitespace, duck typing, (not that I knew what ANY of that meant anyway)) I decided that I really wanted to give it a shot. I started out with codeacademy to get my feet wet, I'd tinker with idle while my wife and I would watch netflix after the kids went to bed. Then I started dreaming in code.

Have you ever had "work dreams"? The kind you have for about 2 weeks after starting a new job that's really hard? That was python for me. Being primarily in a Windows environment it was hard to find anything for python to do initially at work. My boss didn't program, and really didn't see the value in it. Then one day I found myself needing to compare a list of files. I needed to find all the files that were in one column but not in the other. I had them in excel and after working through a formula I had my answer, but I hated it. All I wanted to do was write something like--

select name from column1 where name not in (select name from column2); 

Enter python and sqlite. It probably took me about 3 hours to figure it out, but I imported a csv into a sqlite table in python so I could query it. BAM! I was hooked from then on.
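
For anyone curious what that looks like, here is a rough sketch of the idea using only the standard library. The sample data and the choice of one table per spreadsheet column are my assumptions; the table names `column1` and `column2` are picked so the query above works as written.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the exported spreadsheet: two columns of file names.
CSV_TEXT = """column1,column2
a.txt,a.txt
b.txt,c.txt
d.txt,
"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE column1 (name TEXT)")
conn.execute("CREATE TABLE column2 (name TEXT)")

# Load each spreadsheet column into its own table, skipping empty cells.
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    if row["column1"]:
        conn.execute("INSERT INTO column1 VALUES (?)", (row["column1"],))
    if row["column2"]:
        conn.execute("INSERT INTO column2 VALUES (?)", (row["column2"],))

# Names that appear in the first column but not in the second.
missing = [name for (name,) in conn.execute(
    "SELECT name FROM column1 "
    "WHERE name NOT IN (SELECT name FROM column2) ORDER BY name"
)]
print(missing)
```

With a real export you would pass an open file to `csv.DictReader` instead of the `io.StringIO` wrapper.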

Every night I would tinker, read, and play. I found tons of things to automate at work, making my time so much more effective. I loved it. I became a python evangelist. I'd like to say that my boss was impressed, but really he never came around, and it frustrated me. Fast forward a year.

I had heard about the DevOps movement and though I didn't understand it completely at the time I thought that being a Developer and Systems Admin mutant sounded like a lot of fun, and something I could really be good at.

After having a rough time with my boss one day I decided to check the local classifieds. I saw an ad for a DevOps Admin. Basically this guy needed to know hardware, networking, provisioning, something called puppet, and one of three scripting languages- ruby, bash, or python.

I looked at puppet, and after having learned about booleans and strings and syntax from python, picking it up wasn't a problem. I got hired on the spot for $90k USD. A clean 50% raise. I use python every single day. I write scripts to check if databases back up properly, if services are up, if all 1000 of my physical servers are getting their updates, to provision RAIDs, you name it. I integrate what I write into puppet, fabric, and a host of other tools that I've learned along the way.

After doing that for a little over a year now, I'm about to hire 2 guys under me as we expand and I'm moving up to $120k USD. I'm learning django for fun and am just starting into machine learning. I check out /r/python every day, you guys have been so helpful to me along my way. And if I can learn python, anybody can!!!

TL;DR I learned python in a year and got a 50% raise. 1 year later I got another 25% raise, all from python!

edit: percentages, oh math...

submitted by self_made_sysad to Python
[link] [142 comments]

Posted on 6 May 2014

What is the best part of python you wish people knew about?

I just quit my job at a major software company to be with a startup in downtown Seattle and it looks like our stack is Python based. I'm new to Python but I want to learn fast; so please, let me know what you like the most (or hate the most?) about python, other python developers' code, etc. so I can take all the good and not use the bad as I learn this new language.

Who knows, maybe you will need to maintain my code someday, so you could only be helping yourself!

Thanks in advance!

submitted by honestduane to Python
[link] [228 comments]

Posted on 16 December 2013

Eric Idle here. I've brought John Cleese, Terry Gilliam, Terry Jones and Michael Palin with me. We are Monty Python. AUA.

Hello everybody. I had so much fun last November doing my previous reddit AMA that I decided to return. I'm sure you've seen the exciting news, but here we are to confirm it, officially: Monty Python is reunited. Today is the big day and as you can imagine it's a bit of a circus round here, but we'll be on reddit from 9am for ninety minutes or so to take your questions. We'll be alternating who's answering, but everyone will be here!:

  • J0hnCleese
  • Terry_Gilliam
  • TerryJonesHere
  • _MichaelPalin


Update: We're running a little late but will be with you in 10-15 minutes!

Update 2: The url for tickets - - available Monday

Update 3: Thank you for all the questions. We tried to answer as many as we could. Thanks everyone!

submitted by ericidle to IAmA
[link] [7740 comments]

Posted on 21 November 2013

What you do not like in Python?

I'm a big fan of Python! I use it every day! But there are things in Python which are annoying, strange and so forth (things you really don't like). If you have any, please share your thoughts. For example:

  • The built-in set type has methods like symmetric_difference_update. I don't like such long method names in built-in types.
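
For what it's worth, the long method also has a short operator spelling. A small sketch with made-up sets:

```python
a = {1, 2, 3}
b = {3, 4}

# In place: keep only the items that are in exactly one of the two sets.
a.symmetric_difference_update(b)

# The ^= operator performs the same in-place update.
c = {1, 2, 3}
c ^= b
```

One difference worth knowing: the method accepts any iterable, while the operator form requires both operands to be sets.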
submitted by krasoffski to Python
[link] [892 comments]

Posted on 18 September 2013

Python interview questions

I'm about to go to my first Python interview and I'm compiling a list of all possible interview questions. Based on resources that I've found here, here and here I noted down the following common questions, what else should I add?


  • What are Python decorators and how would you use them?
  • How would you set up many projects where each one uses different versions of Python and third party libraries?
  • What is PEP8 and do you follow its guidelines when you're coding?
  • How are arguments passed – by reference or by value? (easy, but not that easy, I'm not sure if I can answer this clearly)
  • Do you know what list and dict comprehensions are? Can you give an example?
  • Show me three different ways of fetching every third item in the list
  • Do you know what is the difference between lists and tuples? Can you give me an example for their usage?
  • Do you know the difference between range and xrange?
  • Tell me a few differences between Python 2.x and 3.x?
  • The with statement and its usage.
  • How to avoid cyclical imports without having to resort to imports in functions?
  • what's wrong with import all?
  • Why is the GIL important? (This actually puzzles me, don't know the answer)
  • What are "special" methods (__foo__), how they work, etc
  • can you manipulate functions as first-class objects?
  • the difference between "class Foo" and "class Foo(object)"
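
As a possible answer to the "every third item" question above, here is a sketch of three approaches. It assumes "every third" means indices 0, 3, 6 and so on; start at index 2 instead if the interviewer means the 3rd, 6th, 9th items.

```python
from itertools import islice

items = list(range(10))

# 1. Extended slicing with a step of 3.
by_slice = items[::3]

# 2. A list comprehension that filters on the index.
by_index = [x for i, x in enumerate(items) if i % 3 == 0]

# 3. itertools.islice, which also works on iterators that cannot be sliced.
by_islice = list(islice(items, 0, None, 3))
```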

tricky, smart ones

  • how to read a 8GB file in python?
  • what don't you like about Python?
  • can you convert ascii characters to an integer without using built in methods like string.atoi or int()? curious one
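
Two of the tricky ones lend themselves to a short sketch. The function names `ascii_to_int` and `count_lines` are made up for illustration, and counting lines is only a stand-in for whatever processing the 8GB file actually needs.

```python
def ascii_to_int(text):
    """Convert a string of ASCII digits (optional leading sign) to an int."""
    sign = 1
    digits = text
    if text[:1] in ("+", "-"):
        sign = -1 if text[0] == "-" else 1
        digits = text[1:]
    if not digits:
        raise ValueError("no digits in %r" % (text,))
    value = 0
    for ch in digits:
        offset = ord(ch) - ord("0")  # distance from the character '0'
        if not 0 <= offset <= 9:
            raise ValueError("not a digit: %r" % (ch,))
        value = value * 10 + offset
    return sign * value


def count_lines(path):
    """Count lines lazily: iterating a file object yields one line at a time,
    so the whole file never has to fit in memory."""
    total = 0
    with open(path, "r") as handle:
        for _line in handle:
            total += 1
    return total
```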

subjective ones

  • do you use tabs or spaces, which ones are better?

Ok, so should I add something else or is the list comprehensive?

submitted by dante9999 to Python
[link] [187 comments]

Posted on 19 August 2013

Python saved my ass tonight.

It's Friday night, and I'm stuck at work because Apache isn't working, and without it, I can't serve the files I need to update the embedded device I'm working on. So on a whim, I googled "python fileserver", and this little gem popped up:

python -m SimpleHTTPServer 

Running that from the directory I needed to grab files from saved me the time of debugging Apache (aka my worst nightmare), and, possibly by extension, my job. Thanks Python!
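
Note that `SimpleHTTPServer` is the Python 2 module name; on Python 3 the equivalent command is `python -m http.server`. If you want the same thing from inside a script, something like the following sketch should work on Python 3.7 or later. The helper name `serve_directory` is my own; binding to 127.0.0.1 and port 0 (let the OS pick a free port) are assumptions chosen to keep the example safe.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP in a background thread.

    Port 0 asks the OS for any free port; read the real one from
    `server.server_address[1]`.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call `server.shutdown()` followed by `server.server_close()` when you are done. To reach the server from another device, as in the story above, you would bind to an address that device can see rather than 127.0.0.1.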

submitted by LightWolfCavalry to Python
[link] [85 comments]

Posted on 9 August 2013

Common misconceptions in Python

What are some common misconceptions that people have when programming in Python? Here are a couple that were passed around a mailing list I'm on:

'list.sort' returns the sorted list. (Wrong: it actually returns None.)
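
A quick sketch of the difference, using a throwaway list:

```python
numbers = [3, 1, 2]
result = numbers.sort()    # sorts the list in place and returns None

fresh = sorted([3, 1, 2])  # leaves its argument alone, returns a new list
```

So `numbers = numbers.sort()` silently replaces your list with None; use `sorted()` when you want the sorted list as a value.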

Misconception: The Python "is" operator tests for equality.

Reality: The "is" operator checks to see if two variables point to the same object.

This one is especially nasty, because for many cases, it "works", until it doesn't :)

In [1]: a = 'hello'

In [2]: b = 'hello'

In [3]: a is b

Out[3]: True

In [4]: a = 'hello world!'

In [5]: b = 'hello world!'

In [6]: a is b

Out[6]: False

In [7]: a = 3

In [8]: b = 3

In [9]: a is b

Out[9]: True

In [10]: a = 1025

In [11]: b = 1025

In [12]: a is b

Out[12]: False

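This happens because the CPython implementation caches small integers (-5 through 256) and interns some strings, so the underlying objects really are the same, sometimes. This is an implementation detail, not something the language guarantees.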

If you want to check if two objects are equivalent, you must always use the == operator.
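
Here is a small sketch that separates what the language guarantees from what depends on the implementation. The values are built at runtime because literals typed into the same file may be merged by the compiler, which would hide the effect.

```python
# Built at runtime so the compiler cannot merge them into one constant.
a = int("1025")
b = int("1025")
print(a == b)  # value equality: what you almost always want
print(a is b)  # identity: depends on the implementation, do not rely on it

# Two separately created lists are always distinct objects, even when equal.
x = [1, 2]
y = [1, 2]
z = x  # a second name for the same list

print(x == y, x is y, z is x)
```

The usual place where "is" is the right tool is comparing against singletons, as in `value is None`.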

submitted by rhiever to Python
[link] [243 comments]

Posted on 13 May 2013

What is Python not a good language for?

I am moving from writing one-off code and scripts to developing tools which are going to be used by a larger group. I am having trouble deciding if Python is the right tool for the jobs.

For example, I am responsible for processing a 1 GB text file into some numerical results. Python was the obvious choice for reading the text file but I am wondering if Python is fast enough for production code.

Edit: Thanks for the all responses. I will continue to learn and develop in Python.

submitted by Hopemonster to Python
[link] [230 comments]

Posted on 6 May 2013

Why do you choose Python over other languages?

Hi, coding newbie here, I want to know why you prefer Python over other languages, and its pros and cons. Really interested in learning Python, any tips?

Edit: Wow, such great feedback, as I see the main pro is the overall badass community that Python has behind it (refer to all the comments in this thread), thanks guys.

Edit 2: The question now. Python 2.x or 3.x?

submitted by Rokxx to Python
[link] [165 comments]

Posted on 5 April 2013

I use PHP. Whenever I meet a Python guy, they tell me how much better it is, and I'd like the low-down on the reasons.

I'm not bothered with the fact that PHP was not designed and has inconsistencies etc., because I know my way around it well enough that it doesn't matter. I'm curious whether using Python would help me, as I don't hear much negativity around it.

What I want to know is, in terms of web dev, are there things Python can do that PHP can't? Is the language so much better that I'll be able to write better code in less time? Is it as fast as PHP, and are the frameworks as varied and battle tested? Are there any shortcomings to Python that would trip me up?

Thanks guys.

submitted by maloney7 to Python
[link] [207 comments]

Posted on 19 November 2011

Are there any things about Python that you do *not* like, or that you wish were done differently, or that you flat out think are wrong?

I lightheartedly joked in another thread that if the person had agreed with my point (that Python 3 seems very slightly harder to code in than Python 2.x - also a lighthearted, almost completely unfounded critique), that it would be the first time I'd ever seen any Python user online agree with any criticism of any part of the language. In this last bit I'm not really joking.

I had many newbie critiques a few years ago - 'self', the fact that you can't join a string list with myList.join(', '), something about slicing that I forget now, that it was confusing which things worked in-place, and which worked on a copy, etc. - and in a forum (not reddit) where I posted up my lengthy list (mostly to see what people thought of these things), I was met with a wall of responses, all strongly in favor of every last part of all of it, and even of things I hadn't mentioned. In 3 years I realize now I have never once seen anyone critique any part of the language and not be met with all manner of deep, philosophical justifications as to why that thing or those things must be that way.

It's the perfect language, I guess.

So my new question is just straight up: IS there anything about Python you don't like? I mean, it is moving to 3, and there are changes, so clearly 2.x had room for improvement, so let's hear it. Be prepared for a battle on all fronts from everyone else in here, though, whatever you say :) I'd love to hear from the real experts, the people who usually wield seemingly powerful reasoning and long strings of computer science words in their arguments.

This itself isn't a critique, nor even a jab, but just another attempt to learn more.

submitted by gfixler to Python
[link] [576 comments]

Posted on 16 November 2011

A website that lets you browse Reddit like you're reading/coding in Python!

...or Java (and soon, Ruby, PHP, C#, etc.).

It's my first website with Flask (my first real dynamic website?). I wanted the domain to be, but it was too expensive :(. So I just asked my brother to help me host it.

Comments appreciated. :)


  • NSFW indicator for Python (can't figure out where/how to place it in Java, but it still checks for NSFW so it won't load image previews)
  • don't preload all images (thanks to canuckkat)
  • use def instead of class in Python


I just opened up the repo at bitbucket :)

Thanks everyone!

submitted by ares623 to Python
[link] [73 comments]

Posted on 6 September 2011

Ask PyReddit: If you were making your own Python-like programming language, what would be different?

We all know Python isn't perfectly perfect, just practically perfect.

With that in mind, what changes would you make if you were brainstorming the ideal Python-like language? For example, do we really need the colon after an if statement? Shouldn't def f(default=[]): work the way you'd expect and not end up with a single global []? Isn't Ruby onto something when it makes mutating methods end in an exclamation mark by convention? And don't we really need a better syntax for passing an anonymous block as a callback? …
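
For anyone who hasn't hit the default-argument surprise yet, here is a minimal sketch (function names are made up). Strictly speaking the default list isn't global: it is created once, when the def statement runs, and stored on the function object.

```python
def append_shared(item, bucket=[]):
    """The default list is created once and reused by every call."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """The common workaround: use None as a sentinel, build the list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared(1)` and then `append_shared(2)` gives `[1]` followed by `[1, 2]`, while the same two calls to `append_fresh` give `[1]` and `[2]`.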

What are your ideas, /r/Python?

submitted by earthboundkid to Python
[link] [250 comments]

Posted on 29 May 2011