Saturday, April 18, 2015

Python is a horrible, horrible teaching programming language

I have never particularly liked Python. I mean, as a programmer, I consider any coercion into a particular way of doing things an insult. Thus, the whole there-is-only-one-way-to-do-it and benevolent-dictator affairs have never rhymed with my personal philosophy. But I always refrained from criticizing Python. After all, I have never used Python extensively, and my contact with it has never resulted in enough pain for me to hate it.

Well, life changed. I was forced into Python by a class. An algorithms class, no less. I love algorithms and theory. Everything there is shiny and flawless, with no wiggle room for bugs and the like. Plus, programming in these situations is exciting: the problem is well-defined, the graders favor style over pesky optimization, and the solution is polished. This is the exact reverse of professional work, where the result (not the code, but its effect) is everything, and pretty code costs nights and weekends (plus lots and lots of fighting).

When I first realized that my class required Python, excitement actually prevailed briefly. After all, if you ever search for something like "python teaching language," you will see people say all sorts of glorious things about how beautiful Python is for teaching. I have never liked Python's tyrannical philosophy, but, well, this was a golden chance to learn Python right on its own turf. Maybe I would like it. Maybe my opinion would match that of my friend (apparently Python grew on him after a while). Maybe.

Well, Python crushed my hope with its stupidity (seriously, I have no other name for this), bad design, and general annoyance to use.

Firstly, let me be very frank: I miss type declarations. I miss them. I mean, production code can sustain the lack of type declarations much better than academic code. Why? Because you have tests and documentation and an expectation of proficiency in the language to fill in the blanks. Academic code delivers an idea, not an execution. So it should be readable without a compiler, without running, without tests, and with minimal documentation. For goodness' sake, the code itself is the documentation of the text (have you ever read a computer science paper? The code explains the English). However, without type declarations, it's impossible to figure out how to use a value without context. Each solution skeleton in my class has 10 lines of comments just to explain the expected types and usage of the inputs and outputs. That's 100+ characters for what could be written in about 10 characters in Java or C#. Seriously.
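To make this concrete, here is a made-up skeleton in the spirit of the ones I am complaining about (not the actual class code; the function and its types are invented for illustration). Every one of those comment lines exists only to say what the types are:

    def shortest_path(graph, source):
        # graph:   dict mapping a vertex (int) to a list of (neighbor, weight)
        #          tuples, where neighbor is an int and weight is a float
        # source:  int, the starting vertex
        # returns: dict mapping each vertex (int) to its distance (float)
        ...

The same information in a typed language fits in the signature itself, something like Java's Map<Integer, Double> shortestPath(Map<Integer, List<Edge>> graph, int source).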

Another thing on type declarations: people keep whining about how many characters they waste. Well, let's ignore my comment about the comments needed for dynamically typed inputs and outputs, and assume for a moment that you can read the mind of the coder and know how those things should be used. Would dynamic typing save a lot of waste in that case? The answer is no. Remember, we are talking about an academic, teaching situation here. The most important virtue here is readability, not efficiency. This generally leads to quite small functions with very few extra variable declarations aside from the function inputs, and most of those extra variables are counters (you know, i, j, k, etc.). In most cases, the variables are values passed between functions, which you have to declare as arguments anyway. Furthermore, because those variables are interfaces between functions, one often wants to document how they should behave, i.e., write out the types. Thus, the saving here is minimal, if it exists at all. And the readability of dynamic typing goes down the drain thanks to the comments.
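For instance, take a hypothetical little "academic" function (invented here, not from the class): the only names that are not parameters are a counter and a running total, so writing their types out would cost a handful of characters, not pages.

    def dot(xs, ys):
        total = 0.0                    # the one "extra" variable
        for i in range(len(xs)):       # the loop counter
            total += xs[i] * ys[i]
        return total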

Speaking of variable declarations, since when is declare-when-first-used easy to read? Again, this may be true in production code, where everyone deals with the same set of code day after day. In academic settings, this is bullshit. To determine what a variable should be, one has to look for its first use, usually in the thick of the processing. I remember how Pascal was adamant about all variables being declared right at the beginning. That requirement stands for a reason: you know, loud and clear, how each variable should behave. No need to read through the code, no need to guess and assume. This helps you read a massive amount of code at once without any warm-up time. This helps academic code.
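A toy illustration (invented, and assuming a non-empty input): in the first version the reader only learns what best and best_score are supposed to be by digging into the middle of the loop; in the second, everything is announced up front, Pascal-style.

    def argmax(scores):
        for i, s in enumerate(scores):
            if i == 0 or s > best_score:   # best and best_score appear out of nowhere
                best_score = s
                best = i
        return best

    def argmax_upfront(scores):
        best = 0                  # index of the best element so far
        best_score = scores[0]    # its score
        for i, s in enumerate(scores):
            if s > best_score:
                best_score = s
                best = i
        return best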

Well, let's get past the stupidity of variables and their types. Let's talk about Python's inadequacy as a programming language. See, my instructors seem to like something called NumPy a lot. I get it, it's for numerical computation. However, NumPy imposes a different set of containers, with vaguely similar syntax but mutually incompatible with Python's built-in containers. What does this mean? One of two things must be true: either Python does not provide a set of interfaces (think List in Java and C#) general enough to do things with, or the NumPy people are fundamentally stupid and lazy. I pick the first reason. I trust the people. I distrust dictators. This is kind of like Go and APL (never used them, only a passing insult here; maybe incorrect). Basically, in those languages, all users are bastards who can't do things well. Thus, the only way to actually do anything outside the will of the language creators is to write fundamentally separate interfaces that resemble the base language. Python is worse: because variables are dynamically typed, you can't know for sure what the hell is in there. So, if you receive an array, be sure to check where it came from (no type, remember) before dividing it or putting numbers into it. Putting a float into an int array, which is usually fine elsewhere, may silently destroy your program.
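Here is a small sketch of the kind of surprise I mean (the values are made up): a built-in list and a NumPy array look almost identical at the call site, yet behave nothing alike, and the int-typed array quietly mangles a float.

    import numpy as np

    plain = [1, 2, 3]              # built-in list
    arr = np.array([1, 2, 3])      # NumPy array; prints almost identically

    print(plain + [4])             # list concatenation: [1, 2, 3, 4]
    print(arr + [4])               # broadcast addition:  [5 6 7]

    arr[0] = 2.7                   # silently truncated to 2 (the array's dtype is int)
    print(arr)                     # [2 2 3]
    # plain[0] = 2.7 would have stored 2.7 faithfully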

Oh, this brings me to a third problem: the necessity of running the program to see obvious flaws. I frankly don't understand the whole cheering for runtime errors over compilation errors. It's beyond stupid. In fact, it is an insult to the learners, considering them incapable of reading code. Why? Compilation errors are obvious. Remember, if a computer can pick it up, a reasonably trained human can. Runtime errors, on the other hand, are subtle and hard to pick out. For example, dividing a number by a string can be either a compilation error (since variables must be typed) or a runtime error. The former can be picked out easily by a human (let's face it, humans can read a string variable declaration); the latter cannot (um, what is that variable's type again?). When everything is forced to be spelled out, everything is clear. Humans can read it, with or without computers. When everything is "trust me, it's fine," well, sorry, I don't trust myself not to mistype things.
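A hypothetical example of the kind of bug I mean (the function is invented):

    def average(total, count):
        return total / count           # nothing here says count must be a number

    print(average(10, 2))              # fine: prints 5.0
    # print(average(10, "2"))          # TypeError: unsupported operand type(s) for /: 'int' and 'str'
    #                                  # ...and you only find out when this exact call finally runs;
    #                                  # a typed signature would have rejected it before anything ran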

By the way, I should also point out that this is another difference between production and teaching code. In production, most of the time, you frankly don't know what the solution is supposed to be. At least I don't. So the usual process involves trying things out, seeing what happens, and then finally coding up a final solution based on the evidence (this is especially true for more complicated bugs or novel situations). In class, the reverse is true. The learners start with a body of knowledge to try out, and the code's purpose is to express that knowledge in more concrete terms. This is true for both theory classes and more practical ones. Again, each class has a set of concepts, and the learners should solve the problems with those concepts before coding. Thus, most issues with the code can be picked up at compilation, either by eyes or by machine. Lastly, since teaching code demonstrates a concept, often it does not need to be run at all. The graders just compile it in their heads, go through the logic, and give feedback. In such a case, runtime errors are terrible. They hide hideous bugs while providing incentives to hack things together. Hacking is not the point of knowledge transfer. Hacking is for production.

Finally, let's talk about indentation. Another big selling point of Python: hear ye, hear ye, your code's readability is enforced by your interpreter. Yeah. Never mind that the interpreter fails to detect a host of issues until runtime, never mind that input values' behaviors are undefined by default, never mind that the basic types may be unusable. Python will make your cosmetics look good! Bow down to pretty code! This sickens me every time. I used to be a TA for a Scheme-based class. Every time (and I mean every time), the students' love for Scheme would jump about five-fold after I taught them correct indentation, and they would do it without the compiler acting like Hitler. Any programmer with a few months of experience will indent reasonably well in most C-like programming languages. Furthermore, given that teaching code is usually short and sweet, this whole concept of style coercion is generally useless. On the other hand, I have yet to notice any standardization in naming (you know, internalCapitalization vs. underscores_for_spaces). Maybe I am too green, but it seems like people just do whatever the hell they like here. Again, teaching code is short and sweet, so this kind of style matters a great deal. But of course, it's impossible to enforce (maybe Python should outlaw underscores in names, or outlaw mid-name capitalization).
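To put it another way (an invented snippet): the interpreter happily swallows three naming conventions in one file, while a single stray space is fatal.

    # Mixing naming styles in one file: the interpreter does not care.
    def computeShortestPath(graph): ...       # camelCase
    def compute_longest_path(graph): ...      # snake_case
    MaxIterations = 100                       # looks like a class name, is a constant

    # Meanwhile, one misaligned space is a hard error:
    #     def f():
    #         x = 1
    #          y = 2    # IndentationError: unexpected indent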

Frankly, here is my impression of Python and its cheerleaders: Python is great for very, very beginners whose only interest is showing off. It is very fast to punch in something resembling good code. It is very fast if you don't do anything major. It is very fast to get buggy code "running." However, as soon as you put any serious logic in, Python crumbles like a worm under someone's shoe. Its code needs serious context (either comments or the usage site) to make sense; its abstraction layer is lacking; its compilation system cares more about cosmetics than substance. Basically, everything is wrong. On top of that, it lacks any kind of mind-twisting advancement (I am thinking about Lisp and Rust) or pliability for hacking and playing (I am thinking about Perl, of course, but C is a good example too). It's like a dumb dictator: it insists on minor styles but lets bigger problems go unchecked. I sincerely hope I won't cross paths with it again. Ever.
