
Unavoidable bugs (Read 1443 times)

eric :)


    When it comes to software engineering, bugs are expected.  We don't purposefully write bad code, although some programmers might if they're disgruntled.  Most of us take pride in our work.

    Most bugs could be avoided simply by thinking through the logic.  Others are harder to find because they involve interactions amongst multiple components.  They only happen under very specific conditions and are nightmares to replicate.  Still, some are completely beyond a programmer's control.

    Whenever RA encounters an unexpected error, it sends me an email with details about the problem to help me fix it.  I get more of these emails after a site update.

    RA was unavailable for many users yesterday morning.  I received tens of thousands of emails from RA describing the same problem: "There is already an open DataReader associated with this Connection which must be closed first."  You don't have to understand what it means, only that it was the error I received and that it was related to the database.  This error is considered catastrophic because it prevented users from using the site.  It is almost as bad as the site being completely down.

    I encountered this problem years ago and only during development.  I made the necessary changes and it never happened again.  I was quite perplexed when I saw this error yesterday because I didn't think it was possible anymore.  Since the computer never lies, the problem was there and finding it would be like finding a needle in a haystack.

    I examined all the changes I made in the past several days, thinking one of them may have caused it since the problem only popped up yesterday.  Everything looked fine.  I looked at all the changes in specific components that are part of the latest update and that came up empty as well.

    As a last resort, I went through all the source code that interacted with the database but didn't find anything obvious.  Without a smoking gun, the only thing I could do was to add a bit of code that guarantees this problem won't happen ever again.

    I rolled out the fix last night when the site was the least busy and went back to working on other stuff.  Several hours later, RA sent me emails with the same problem again.  The only way to fix it was to restart the web server.  I can't monitor it constantly.  Crap!

    I looked at server logs, database logs, performance monitors and whatever else I could think of but found nothing obvious.  I looked at many of the emails hoping to find a pattern but saw nothing incriminating.  Half an hour later, another torrent of emails arrived, signaling the return of the problem.  I restarted the web server and resumed my investigation.

    In both batches of emails, the majority of the page requests were by a web crawler called Baiduspider+.  It belongs to Baidu, a Chinese search engine.  What's more, this crawler requested many pages per second.  The first batch of emails arrived just as the nightly backup completed.  It is quite possible that this problem occurs only when the server is under high load.
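
    For anyone curious how a pattern like this can be pulled out of a pile of error reports, here is a minimal Python sketch.  It is not RA's code: the report fields (`user_agent`, `path`), the sample data and the function names are all made up for illustration.  It simply counts reports per user agent and flags one if it accounts for more than half of them.

```python
from collections import Counter

# Hypothetical error reports. The fields in RA's real emails are not
# described in the post, so these names and values are assumptions.
reports = [
    {"user_agent": "Baiduspider+", "path": "/forum/1"},
    {"user_agent": "Baiduspider+", "path": "/forum/2"},
    {"user_agent": "Firefox", "path": "/log"},
    {"user_agent": "Baiduspider+", "path": "/forum/3"},
    {"user_agent": "Googlebot", "path": "/forum/4"},
]


def tally_user_agents(reports):
    """Return (user_agent, count) pairs, most frequent first."""
    counts = Counter(r["user_agent"] for r in reports)
    return counts.most_common()


def dominant_agent(reports, threshold=0.5):
    """Return the user agent behind more than `threshold` of the reports.

    Returns None when there are no reports or no agent crosses the threshold.
    """
    if not reports:
        return None
    agent, count = tally_user_agents(reports)[0]
    return agent if count / len(reports) > threshold else None


if __name__ == "__main__":
    for agent, count in tally_user_agents(reports):
        print(agent, count)
    print("dominant:", dominant_agent(reports))
```

    A tally like this only shows who was making the requests when the errors happened.  It points toward load as a suspect but does not prove the cause.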

    I overloaded my test environment to replicate the scenario but was not able to reproduce the problem.  Still, I knew this had to be the cause.  Out of desperation, I Googled the problem.  After several tries with different keywords, I finally found my answer.

    The latest RA release contains an updated MySQL database access library, ironically to fix another bug.  Someone had reported a critical bug against the access library that produces the same error message when the database is under high load.  Bingo!  There is a fix but it is not released for public consumption yet.  Since this fix is vital to RA, I had to download the latest source code and build it myself.  I will upload the fix tonight when the server is less busy.

    As a whole, I am happy to know that I was not responsible for the bug despite spending over a day tracking down the problem.  It goes to show that even if you write perfect bug-free code, bugs are still inevitable.

    CanadianMeg


    #RunEveryDay

      Thank you for all the work you do to keep this wonderful site running. It is appreciated! Smile

      Half Fanatic #9292. 

      Game Admin for RA Running Game 2023.

      xor


        Thanks for all the work, eric with the sideways smile.

        'Since the computer never lies'

        I know I've made some very poor decisions recently, but I can give you my complete assurance that my work will be back to normal. I've still got the greatest enthusiasm and confidence in the mission. And I want to help you.

         

        HappyFeat


          Thank you for all the work you do to keep this wonderful site running. It is appreciated! Smile

          Same goes for me. It is sooooo very much appreciated. Thank you, thank you, thank you!

          Don't make excuses for why you can't get it done. 

          Focus on all the reasons why you must make it happen.

          Ojo


            Thank you!  Thank you! 

            Sara

            MM #2929


            You'll ruin your knees!

              So glad you overcame this challenge, Eric Smile


              Thanks so much for allowing me to keep my mile collection here

              ""...the truth that someday, you will go for your last run. But not today—today you got to run." - Matt Crownover (after Western States)

                I don't have the faintest idea what all that computer talk means, but it makes my head hurt in sympathy. Thanks for the sleuthing.
                zoom-zoom


                rectumdamnnearkilledem

                  Eric...hunting bugs with the prowess of a mighty tiger:


                  Getting the wind knocked out of you is the only way to

                  remind your lungs how much they like the taste of air.    

                       ~ Sarah Kay

                  keeponrunning


                    I don't have the faintest idea what all that computer talk means, but it makes my head hurt in sympathy. Thanks for the sleuthing.

                     +1

                    Sulphur Springs 50km-- Ancaster, ON-- May 28, 2022

                    Tally in the Valley 12 hours-- Dundas, ON -- July 30, 2022 (Support SickKids Toronto)

                    Stokely Creek-- 56km-- Sault Ste. Marie, ON-- Sept. 24, 2022

                     

                     

                      There is no doubt whatsoever, you da man.

                      E.J.
                      Greater Lowell Road Runners
                      Cry havoc and let slip the dawgs of war!

                      May the road rise to meet you, may the wind be always at your back, may the sun shine warm upon your SPF30, may the rains fall soft upon your sweat-wicking hat, and until you hit the finish line may The Flying Spaghetti Monster hold you in the hollow of His Noodly Appendage.

                      Trent


                      Good Bad & The Monkey

                        Oh.  I thought this was another insect encounters thread.

                        Wink

                        GREAT JOB RA-KING!
                        LedLincoln


                        not bad for mile 25

                          Great job, Eric Smile   Thanks for all your brainwork and dedication.
                          LedLincoln


                          not bad for mile 25

                            Huh, I was trying to type Eric with the sideways smile, but the system converted it to a bright yellow happy face.  Oh well.
                            Trent


                            Good Bad & The Monkey

                              Gotta add a space between the : and the )


                              Non ducor, duco.

                                I thought this said unavoidable HUGS. 

                                 

                                hehe.
