Created by Zed A. Shaw Updated 2024-02-17 04:54:36

Part V: Parsing Text

This part of the book will teach you about text processing and, specifically, the beginnings of parsing text formally. I won't get into all of the theoretical elements of programming language theory, since that could fill an entire university degree. This is merely an introduction to simple, naive text parsing that you can use in many programming situations.

Most programmers have a strange relationship with parsing text. Parsing is at the core of computer programming, and it's one of the most well-understood and formalized aspects of computer science. Parsing data is everywhere in computing. You'll find it in network protocols, compilers, spreadsheets, servers, text editors, graphics renderers, and nearly anything that has to interface with a human or another computer. Even when two computers exchange data over a fixed binary protocol, there is still an aspect of parsing involved despite the lack of text.
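To make the binary protocol point concrete, here is a minimal sketch of parsing a fixed binary message with Python's standard struct module. The message layout (a 1-byte type, a 2-byte big-endian length, then the payload) and the name parse_message are invented for illustration and do not come from any real protocol.

```python
import struct

# Hypothetical header layout: 1-byte message type, 2-byte payload length,
# both in network (big-endian) byte order with no padding.
HEADER = struct.Struct("!BH")


def parse_message(data):
    """Split raw bytes into (message type, payload) or raise ValueError."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    msg_type, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return msg_type, payload


print(parse_message(b"\x01\x00\x05hello"))
```

Even with no text involved, the code still recognizes a structure, checks it against rules, and reports an error when the input doesn't fit, which is what a parser does.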

I'm going to teach you parsing because it's an easily understood, solid technique that produces reliable results. When you need to process some input reliably and report accurate errors, you'll turn to a parser rather than trying to handle the input with ad hoc code written by hand. Additionally, once you learn the basics of parsing, it becomes easier to learn new programming languages because you can understand their grammar.

Introducing Code Coverage

In this part you should still be attempting to break and take apart any code you write. The new thing I'm adding in this part is the concept of code coverage. The problem code coverage addresses is that when you write automated tests, you have no real idea whether you've tested most scenarios. You could use formal logic to develop a theory that you've covered everything, but as we know, the human brain is terrible at finding flaws in its own thinking. This is why you use a cycle of "create then critique" throughout this book. It's simply too difficult to analyze your own thinking while you're also trying to create something.

Code coverage is a way to at least get an idea of what you've tested in your application. It won't find all your flaws, but it will show which code branches your tests actually hit, so you can work toward hitting every branch you possibly can. Without coverage you have no idea whether you've tested each branch. A very good example is handling failures. Most automated tests only exercise the expected, successful conditions and never test error handling. When you run coverage, you find out all the ways you forgot to test error handling code.
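As a small sketch of the error handling problem, consider the hypothetical function parse_port below, which is invented for this illustration. A suite containing only test_happy_path never executes either raise line, and a coverage report would flag both as untested. Adding test_error_paths is what exercises them.

```python
def parse_port(text):
    """Hypothetical helper: turn text like '8080' into a port number."""
    if not text.isdigit():
        # Error branch: a happy-path-only test never runs this line.
        raise ValueError(f"not a number: {text!r}")
    port = int(text)
    if not 0 < port < 65536:
        # Second error branch: all digits, but out of range.
        raise ValueError(f"port out of range: {port}")
    return port


def test_happy_path():
    # Exercises only the successful path through parse_port.
    assert parse_port("8080") == 8080


def test_error_paths():
    # Exercises both error branches that the happy-path test skips.
    for bad in ("http", "0", "70000"):
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")
```

The text above doesn't name a tool, so treat this as an assumption: with coverage.py and pytest installed, running "coverage run -m pytest" followed by "coverage report -m" lists the line numbers your tests missed.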

Code coverage also helps you avoid overtesting your code. I've worked on projects with Test Driven Development (TDD) enthusiasts who were proud of their 12:1 test-to-code ratios (meaning 12 lines of tests for every 1 line of code). A simple code coverage analysis showed that they were testing only 30% of the code, and many of those lines were tested the exact same way 6 to 20 times. Meanwhile, simple errors like exceptional conditions in database queries were completely untested and caused frequent failures. Eventually these kinds of test suites become an albatross that prevents project growth and simply eats up human work schedules. It's no wonder so many Agile consultancies hate code coverage.

During the videos in this exercise you'll see me running tests and using code coverage to confirm what I'm testing. You'll be required to do the same thing, and there are tools that will make this easy. I'll show you how to read the test coverage results and how to make sure you're efficiently testing everything you can. The goal is a thorough automated test suite without wasted effort, so you aren't testing just 30% of your code 12 times in a row.
