
Saturday, 5 January 2019

Kernel commits with "Fixes" Tag (revisited)

Last year I wrote about kernel commits that are tagged with the "Fixes" tag. Kernel developers use the "Fixes" tag on a bug fix commit to reference an older commit that originally introduced the bug.   The adoption of the tag has been steadily increasing since v3.12 of the kernel:

The red line shows the number of commits per release of the kernel, and the blue line shows the number of commits that contain a "Fixes" tag.

As a percentage of commits, the use of the "Fixes" tag has been steadily increasing since v3.12, and almost 12.5% of kernel commits in v4.20 are tagged this way.

The "Fixes" tag contains the commit SHA of the commit that introduced the bug, so one can look up the date of the fix and the date of the original commit and determine the time taken to fix a bug.
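The tag itself is just a one-line trailer in the bug fix's commit message; for example (the SHA and subject here are invented purely for illustration):

Fixes: 1234567890ab ("foo: add initial bar buffer handling")

A rough sketch of how the time-to-fix data can be extracted straight from git (the output file name is illustrative and this is not the exact script used for the analysis):

# compute the number of days between each buggy commit and its fix
cd linux
git log --no-merges --grep='^Fixes:' --pretty='%H %ct' |
while read fix_sha fix_time; do
    # pull the referenced SHA out of the Fixes: trailer of the fix commit
    bug_sha=$(git show -s --format=%B "$fix_sha" |
              sed -n 's/^Fixes: \([0-9a-f]\{8,\}\).*/\1/p' | head -1)
    [ -z "$bug_sha" ] && continue
    bug_time=$(git show -s --format=%ct "$bug_sha" 2>/dev/null) || continue
    echo $(( (fix_time - bug_time) / 86400 ))
done > days-to-fix.txt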

As one can see, a lot of issues get fixed in the first few hundred days, while some bugs take years to get fixed.  Zooming into the first hundred days of fixes, the distribution looks like this:


..the modal point is at day 4; I suspect these are issues that get found quickly once commits land in linux-next and are caught by early testing, integration builds and static analysis.

From the thousands of "Fixes"-tagged commits and their time-to-fix data, one can determine how long it takes for a given percentage of the bugs to be fixed:


In the graph above, 50% of fixes are made within 151 days of the original commit, ~69% of fixes are made within a year of the original commit and ~82% of fixes are made within 2 years.  The long tail indicates that some bugs take a while to be found and fixed; the final 10% of bugs take more than 3.5 years to be found and fixed.
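Given a file of days-to-fix values, one per line (such as the one generated in the sketch above), these percentile figures can be pulled out with a quick sort; again, this is only an illustrative sketch:

for p in 50 69 82 90; do
    sort -n days-to-fix.txt | awk -v p=$p \
        '{ d[NR] = $1 } END { print p "% of bugs fixed within " d[int(NR * p / 100)] " days" }'
done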

Comparing the time to fix issues for kernel versions v4.0, v4.10 and v4.20, for bugs that are fixed in less than 50 days, we have:


... the trends are similar; however, it is worth noting that bugs are getting found and fixed a little faster in v4.10 and v4.20 than in v4.0.  It will be interesting to see how these trends develop over the next few years.

Friday, 1 September 2017

Static analysis on the Linux kernel

There is a wealth of powerful static analysis tools available nowadays for analyzing C source code. These tools help to find bugs in code by just analyzing the source code without actually having to execute it.  Over the past year or so I have been running the following static analysis tools on linux-next every weekday to find kernel bugs: cppcheck, smatch, sparse, CoverityScan, clang's scan-build and the latest gcc.
Typically each tool can take 10-25+ hours of compute time to analyze the kernel source; fortunately I have a large server at hand to do this.  The automated analysis creates an Ubuntu server VM, installs the required static analysis tools, clones linux-next and then runs the analysis.  The VMs are configured to minimize write activity to the host and run with 48 threads and plenty of memory to try to speed up the analysis process.

At the end of each run, the output from the previous run is diff'd against the new output to generate a list of new and fixed issues.  I then manually wade through these and try to fix some of the low-hanging fruit when I can find free time to do so.
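The comparison itself is nothing fancy; conceptually it boils down to something like the following (the log file names are illustrative, with one reported issue per line):

# new issues appear only in today's log, fixed issues only in yesterday's
sort yesterday.log > old.sorted
sort today.log > new.sorted
comm -13 old.sorted new.sorted > new-issues.txt     # lines only in today's output
comm -23 old.sorted new.sorted > fixed-issues.txt   # lines only in yesterday's output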

I've been gathering statistics from the CoverityScan builds for the past 12 months tracking the number of defects found, outstanding issues and number of defects eliminated:

As one can see, there are a lot of defects getting fixed by the Linux developers and the overall trend of outstanding issues is downwards, which is good to see.  The defect rate in linux-next is currently 0.46 issues per 1000 lines (out of over 13 million lines that are being scanned). A typical defect rate for a project this size is 0.5 issues per 1000 lines.  Some of these issues are false positives or very minor / insignificant issues that will not cause any run time problems at all, so don't be too alarmed by the statistics.

Using a range of static analysis tools is useful because each one has its own strengths and weaknesses.  For example, smatch and sparse are designed for sanity checking the kernel source, so they have some smarts that detect kernel-specific semantic issues.  CoverityScan is a commercial product; however, they allow open source projects the size of the linux kernel to be built daily, the web based bug tracking tool is very easy to use, and CoverityScan does manage to reliably find bugs that other tools can't reach.  Cppcheck is useful as it scans all the code paths by forcibly trying all the #ifdef'd variations of code - which is useful on the more obscure CONFIG mixes.

Finally, I use clang's scan-build and the latest version of gcc to try and find the more typical warnings caught by the static analysis built into modern open source compilers.

The more typical issues being found by static analysis are ones that don't generally appear at run time, such as in corner cases like error handling code paths, resource leaks or resource failure conditions, uninitialized variables or dead code paths.

My intention is to continue this process of daily checking and I hope to report back next September to review the CoverityScan trends for another year.

Tuesday, 27 January 2015

Finding kernel bugs with cppcheck

For the past year I have been running the cppcheck static analyzer against the linux kernel sources to see if it can detect any bugs introduced by new commits. Most of the bugs being found are minor thinkos, null pointer dereferences, uninitialized variables, memory leaks and mistakes in error handling paths.

A useful feature of cppcheck is the --force option that will check against all the configurations in the source (and the kernel does have many!).  This allows us to check for code that may not be exercised much (because it is normally not built in with most config options) or even find dead code.
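A minimal invocation against a kernel tree looks something like this (the enabled checks and the log file name are just illustrative):

cd linux
# --force makes cppcheck attempt every #ifdef combination rather than a single configuration
cppcheck --force --enable=warning,style --quiet . 2> cppcheck-$(date +%Y%m%d).log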

The downside of using the --force option is that each source file may need to be checked multiple times, once for each configuration.  For ~20800 source files this can take a 24-processor server several hours to process.  Errors and warnings are then compared to previous runs (a delta), making it relatively easy to spot new issues on each run.

We also use the latest sources from the cppcheck git repository.  The upside of this is that new static analysis features are used early and this can result in finding existing bugs that previous versions of cppcheck missed.

A typical cppcheck run against the linux kernel source finds about 600 potential errors and 1700 warnings; however a lot of these are false positives.  These need to be individually eyeballed to sort the wheat from the chaff.

Finally, the data is passed through a gnuplot script to generate a trend graph so I can see how errors (red) and warnings (green) are progressing over time:


..note that the large changes in the graph are mostly due to checking features being enabled (or fixed) in cppcheck.
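The plotting itself is only a few lines of gnuplot driven from a shell script; a sketch along these lines (the data file name and column layout are assumptions):

# cppcheck-trend.dat columns: date, error count, warning count
gnuplot <<'EOF'
set xdata time
set timefmt "%Y-%m-%d"
set terminal png size 800,480
set output "cppcheck-trend.png"
plot "cppcheck-trend.dat" using 1:2 with lines lc rgb "red" title "errors", \
     "cppcheck-trend.dat" using 1:3 with lines lc rgb "green" title "warnings"
EOF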

I have been running the same experiment with smatch too; however, I am finding that cppcheck seems to have better code coverage because of the --force option and seems to have fewer false positives.   As it stands, I am finding that the most productive time for finding issues is around the -rc1 and -rc2 merge times (obviously when most of the major changes land in the kernel).  The outcome of this work has been a bunch of small fixes landing in the kernel to address bugs that cppcheck has found.

Anyhow, cppcheck is an excellent open source static analyzer for C and C++ that I'd heartily recommend as it does seem to catch useful bugs.

Friday, 28 February 2014

Finding small bugs

Over the past few months I've been using static code analysis tools such as cppcheck, Coverity Scan and also smatch on various open source projects.   I've generally found that most open source code is fairly well written; however, most of it suffers from a common pattern of bugs on the error handling paths.  Typically, these are failures to free memory, or freeing memory incorrectly.  Other frequent bugs are uninitialised variables and overly complex code paths that introduce subtle bugs when certain rare conditions occur.  Most of these bugs are small and very rarely hit; some just silently do things wrong while others can potentially trigger segmentation faults.

The --force option in cppcheck to force the checking of every build configuration has been very useful in finding code paths that are rarely built, executed or tested and hence are likely to contain bugs.

I'm coming to the conclusion that whenever I have to look at some new code I should take 5 minutes or so throwing it at various static code analysis tools to see what pops out, then be a good citizen by fixing these issues and sending the fixes upstream. It's not too much effort and helps reduce some of those more obscure bugs that rarely bite but do linger around in code.

Tuesday, 21 August 2012

Testing eCryptfs

Over the past several months I've been occasionally back-porting a bunch of eCryptfs patches onto older Ubuntu releases.  Each back-ported fix needs to be properly sanity checked and so I've been writing test cases for each one and adding them to the eCryptfs test suite.

To get hold of the test suite, check it out using bzr:
 bzr checkout lp:ecryptfs  
and install the dependencies so one can build the test suite:
 sudo apt-get install debhelper autotools-dev autoconf automake \
 intltool libtool libgcrypt11-dev libglib2.0-dev libkeyutils-dev \
 libnss3-dev libpam0g-dev pkg-config python-dev swig acl \
 ecryptfs-utils libattr1-dev
If you want to test eCryptfs with xfs and btrfs as the lower file system onto which eCryptfs is mounted, then one needs to also install the tools for these:
 sudo apt-get install xfsprogs btrfs-tools  
And then build the test programs:
 cd ecryptfs  
 autoreconf -ivf  
 intltoolize -c -f  
 ./configure --enable-tests --disable-pywrap  
 make  
To run the tests, one needs to create lower and upper mount points. The tests allow one to create ext2, ext3, ext4, xfs or btrfs loop-back mounted file systems on the lower mount point, and then eCryptfs is mounted on the upper mount point on top.   To create these, use something like:
 sudo mkdir /lower /upper  
The loop-back file system image needs to be placed somewhere too; I generally place mine in a directory /tmp/image, so this needs creating as well:
 mkdir /tmp/image  
There are two categories of tests, "safe" and "destructive".  Safe tests should run in such a way as to not lock up the machine.  Destructive tests try hard to force bugs that can cause kernel oopses or panics. One specifies the test category with the -c option.  Now to run the tests, use:
 sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper  
The -K option tells the test suite to run the kernel specific tests. These are the ones I am generally interested in since I'm testing kernel patches.

The -b option specifies the size in 1K blocks of the loop-back mounted /lower file system size.  I generally use 1000000 blocks as a minimum.

The -D option specifies the path where the temporary loop-back mounted image is kept and the -l and -u options specify the paths of the lower and upper mount points.

By default the tests will use an ext4 lower filesystem. One can also specify which file systems to run the tests on using the -f option; this can be a comma-separated list of one or more file systems, for example:
 sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper \
 -f ext2,ext3,ext4,xfs  
And also, instead of running a bunch of tests, one can run just a particular test using the -t option:
 sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper \
 -f ext2,ext3,ext4,xfs -t lp-926292.sh
..which tests the fix for Launchpad bug 926292
 
We also run these tests regularly on new kernel images to ensure we don't introduce any regressions.   As it stands, I'm currently adding in tests for each bug fix that we back-port and for most new bugs that require a thorough test. I hope to expand the breadth of the tests to ensure we get better general test coverage.

And finally, thanks to Tyler Hicks for writing the test framework and for his valuable help in describing how to construct a bunch of these tests.

Thursday, 8 October 2009

Ubuntu Kernel Team Bug Policies

Got an Ubuntu kernel related bug and need help reporting it? Or got an audio bug that's kernel related and you want to log the bug? Need advice or hints on how to gather kernel oops messages into a bug report? Or need to figure out how to report a bug upstream?

Well if you need this kind of help, then look no further than the Ubuntu Kernel Bug Policies wiki page. It's got a load of helpful information on kernel bug reporting and also on how bug states are recorded, from being initially reported, through triage and processing by a kernel developer, to being fixed.

We hope this will take the pain out of reporting bugs and help you understand the bug fixing process. Kudos to Leann Ogasawara for this wiki page!

Tuesday, 18 August 2009

ACPI Wake Alarm Bugs

A neat feature with most PCs is the ability to program the Real Time Clock (RTC) to trigger an alarm event at some time in the future and wake the PC up. Some projects such as MythTV use this to wake up a PC to record a TV programme. I've used it to make my laptop wake me up when I've forgotten my alarm clock when I've been travelling.

The method of programming the alarm is relatively straightforward - with older kernels one would write the date and time to wake up into /proc/acpi/alarm, and with newer kernels since 2.6.22 one uses /sys/class/rtc/rtc0/wakealarm. The API is documented fairly well at the ACPI wakeup wiki page at mythtv.org.
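For illustration (run as root; the date and the one hour offset below are arbitrary), programming the alarm looks something like this on the two interfaces:

# older kernels: write a date/time string to the ACPI alarm
echo "2012-08-21 07:30:00" > /proc/acpi/alarm
# kernels since 2.6.22: write seconds since the epoch to the RTC wake alarm
echo $((`cat /sys/class/rtc/rtc0/since_epoch` + 3600)) > /sys/class/rtc/rtc0/wakealarm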

The ACPI wake alarm is also used by the Ubuntu Kernel Team suspend/resume tests; these tests suspend a laptop for a specific amount of time and re-awaken the machine for programmatic suspend/resume soak testing. Unfortunately, we are finding that some machines have a flakey or broken ACPI implementation for the wake alarm - for example, some machines need the alarm to be set to zero first, or to be set twice, for it to work.

One can quite easily test whether the ACPI wake alarm works - simply program it to trigger a short time in the future and check to see if the alarm_IRQ field in /proc/driver/rtc is set to "yes". The second test is to wait until the alarm has triggered and check if the same alarm_IRQ field is now set to "no".
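A hand-rolled check along those lines looks something like this (again as root; the 60 second delay is arbitrary):

# program the wake alarm to fire 60 seconds from now
echo 0 > /sys/class/rtc/rtc0/wakealarm
echo $((`cat /sys/class/rtc/rtc0/since_epoch` + 60)) > /sys/class/rtc/rtc0/wakealarm
grep alarm_IRQ /proc/driver/rtc    # should report "yes" while the alarm is pending
sleep 70
grep alarm_IRQ /proc/driver/rtc    # and "no" once the alarm has fired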

I've hacked up a quick-n-dirty test script to check this out and it can be found in my test script git repository here. Run this test and see - if your machine fails then it may be a good idea to check out the MythTV ACPI wake alarm troubleshooting section here. It may be a BIOS bug that's fixed with a new BIOS upgrade. Failing that, one should file a bug upstream.

So far, I've seen this problem on Lenovo 0768 series laptops, but I expect to see it on other machines too.

UPDATE:
It appears the issue is more to do with the hwclock not being in UTC. One can get the machine to sleep and wake correctly using:

SECS=$((`cat /sys/class/rtc/rtc0/since_epoch`+$SLEEP_SECS))
echo 0 > /sys/class/rtc/rtc0/wakealarm
echo $SECS > /sys/class/rtc/rtc0/wakealarm

Note that I've written to wakealarm twice; the first write is a zero, which apparently helps some buggy BIOSes.

Or the heavy-handed way (which will give Windows a headache):

hwclock --utc --systohc

Ubuntu Kernel Bug Days

Every two weeks the Ubuntu Kernel Team have special Kernel Bug Days where we focus our attention on specific kernel bug categories. The aim is to attack a bunch of bugs that need some more love and attention - there are an awful lot of bugs and they all take time and effort to resolve.

The good news is that we also like participation from community members to help triage and help fix bugs. Sometimes you may have the same hardware or know of a fix to a specific kernel, hardware or BIOS related bug and thus you may be able to help solve a few of those lingering kernel bugs!

For more information about our Kernel Team Bug Days, please visit
https://wiki.ubuntu.com/KernelTeam/BugDay - feel free to join in and help solve some kernel bugs!

Tuesday, 21 July 2009

Regression Ponderings

Today I was digging around an audio bug where an earlier fix for one model of hardware was breaking things for a slightly different model.

The fix that had introduced the bug had modified generic code for a subset of hardware and in doing so broke the code for all the other cases. The fix should have been a quirk for a specific subset of hardware. This made me wonder how many bugs are like this - somebody fixes a bug which then causes regressions for almost identical classes of hardware. It would be interesting to know how many regressions occur like this.

The problem is that when dealing with low-level tinkering, one can only really test one's own hardware and produce a fix for that; making sure it's not causing regressions is hard as one usually does not have all the other hardware to test against. Also, for this particular bug it's not totally clear in hindsight that the original fix was flawed; eye-balling the code may have picked up the bug earlier - but this can be incredibly difficult to do and get right 100% of the time.

Digging down and identifying *where* the code was wrong was the hard part. As it was, diff'ing the buggy code against upstream quickly showed me a fix, and backporting it was a no-brainer.

Sunday, 7 June 2009

How about freezing new features and instead improve existing code quality?

Software has the tendency to get more and more features added over time. This leads to more bloat and newer bugs. Don't get me wrong, it's good to get new features as they make a system more usable. Perhaps we should occasionally stop pushing in new code and instead focus on:

1) Fixing outstanding critical or high priority bugs
2) Removing unwanted or old crufty code
3) Optimizing code for speed
4) Reducing unwanted bloat

It's a bit like going to one's doctor and getting a periodic health check. We humans put on weight, get unfit and occasionally need to get back into trim before taking on new challenges in life. Maybe we need to do the same with our software...