Home
        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                              on Gopher (unofficial)
  HTML Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
  HTML   Show HN: Automated smooth Nth order derivatives of noisy data
       
       
        jcgrillo wrote 3 hours 24 min ago:
         This is great! I've taken sort of a passive interest in this topic
         over the years; some papers which come to mind are [1] and [2], but
         I don't think I've seen a real-life example of using the Kalman
         filter before.
        
  HTML  [1]: https://www.sciencedirect.com/science/article/abs/pii/00219290...
  HTML  [2]: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=92410...
       
        Animats wrote 5 hours 1 min ago:
        That's useful. Can it generate a simple filter for later real-time use,
        based on the statistics of the noise? That would be useful for
        self-tuning controllers.
       
        theaussiestew wrote 5 hours 30 min ago:
         I'm looking to calculate jerk from accelerometer data; I'm assuming
         this would be the perfect use case?
       
          hugohadfield wrote 5 hours 22 min ago:
           This is a perfect use case, let me know how it goes!
       
        pm wrote 6 hours 14 min ago:
        Congratulations! Pardon my ignorance, as my understanding of
        mathematics at this level is beyond rusty, but what are the
        applications of this kind of functionality?
       
          caseyy wrote 1 hour 42 min ago:
           This is very important in controllers that use feedback loops. The
           output of the controlled system is measured, a function is applied
           to it, and the result is fed back into the controller, so the
           output becomes self-balancing.
          
           The applications include self-driving cars, rocketry, homeostatic
           medical devices like insulin pumps, cruise control, HVAC
           controllers, life support systems, satellites, and more.
          
           This is mainly due to a type of controller called the PID
           controller, which is built around such a feedback loop. The
           purpose of a PID controller is to drive a measured value in a
           system toward a target by adjusting the system’s inputs, at least
           some of which are outputs of said controller. In particular, the
           derivative term of a PID controller involves a first-order
           derivative of the measurement, and the smoother its values are
           over time, the better such a controller performs. The problem
           where the derivative values are not smooth, or the second
           derivative is not continuous, is called “derivative kick”.
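           
           To make that concrete, a textbook discrete PID update looks
           roughly like this (a minimal illustrative sketch, not code from
           the posted library):
           
             # Minimal discrete PID step; state = {"integral": 0.0, "prev_error": 0.0}.
             # The derivative term differences the error, so any measurement
             # noise is amplified by the 1/dt factor below.
             def pid_step(setpoint, measurement, state, kp, ki, kd, dt):
                 error = setpoint - measurement
                 state["integral"] += error * dt
                 derivative = (error - state["prev_error"]) / dt
                 state["prev_error"] = error
                 return kp * error + ki * state["integral"] + kd * derivative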
          
           The people building these controllers have long sought algorithms
           that produce at least a good approximation of a measurement from a
           noisy sensor. A good approximation of its derivatives is the next
           level up, a bit harder, and generally good approximations of the
           derivative are a relatively recent development.
          
          There is a lot of business here. For example, Abbott Laboratories and
          Dexcom are building continuous blood glucose monitors that use a
           small transdermal sensor to measure someone’s blood glucose. This is
          tremendously important for management of people’s diabetes. And yet
          algorithms like what GP presents are some of the biggest hurdles. The
          sensors are small and ridiculously susceptible to noise. Yet it is
          safety-critical that the data they produce is reliable and up to date
          (can’t use historical smoothing) because devices like insulin pumps
           can consume it in real time. I won’t go into this in further
          detail, but incorrect data can and has killed patients. So a good
          algorithm for cleaning up this noisy sensor data is both a serious
          matter and challenging.
          
          The same can be said about self-driving cars - real-time data from
          noisy sensors must be fed into various systems, some using PID
          controllers. These systems are often safety-critical and can kill
           people in a garbage-in, garbage-out scenario.
          
           There are about a million applications for this algorithm. It is
          likely an improvement on at least some previous implementations in
          the aforementioned fields. Of course, these algorithms also often
          don’t handle certain edge cases well. It’s an ongoing area of
          research.
          
          In short — take any important and technically advanced
          sensor-controller system. There’s a good chance it benefits from
          advancements like what GP posted.
          
           P.S. the problem is more thoroughly solved for uniformly sampled
           data (i.e. every N seconds) than for non-uniformly sampled data
           (i.e. as available). So once again, what GP posted is really
           useful.
          
           I think they could get a job at pretty big medical and automotive
           industry companies with this; it is “the sauce”. If they
          weren’t already working for a research group of a self-driving car
          company, that is ;)
       
          thatcherc wrote 5 hours 2 min ago:
           I actually have one for this! Last week I had something really
           specific: a GeoTIFF image where each pixel represents the speed in
           the "x" direction of the ice sheet surface in Antarctica, and I
           wanted to get the derivative of that velocity field so I could
           look at the strain rate of the ice.
          
          A common way to do that is to use a Savitzky-Golay filter [0], which
          does a similar thing - it can smooth out data and also provide smooth
          derivatives of the input data. It looks like this post's technique
          can also do that, so maybe it'd be useful for my ice strain-rate
          field project.
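           
           With scipy that's nearly a one-liner; a sketch for a 2D field
           (the window length, polynomial order, and pixel spacing here are
           made up):
           
             import numpy as np
             from scipy.signal import savgol_filter
             
             vx = np.random.rand(500, 500)  # stand-in for the x-velocity raster
             dx = 450.0                     # assumed pixel spacing in metres
             
             # Cubic fit over an 11-pixel window, first derivative along the
             # x axis: a smoothed d(vx)/dx for the strain-rate calculation.
             dvx_dx = savgol_filter(vx, window_length=11, polyorder=3,
                                    deriv=1, delta=dx, axis=1)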
          
          [0] -
          
  HTML    [1]: https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter
       
            defrost wrote 3 hours 29 min ago:
             I've been a heavy user of Savitzky-Golay filters (linear time
             series, rectangular grid images, cubic space domains | first,
             second and third derivatives | balanced and unbalanced
             (returning central region smoothed values and values at edges))
             since the 1980s.
            
            The usual implementation is as a convolution filter based on the
            premise that the underlying data is regularly sampled.
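             
             For regularly sampled data the kernel is fixed, so the whole
             filter collapses to one convolution, e.g. (sketch; window and
             order are arbitrary):
             
               import numpy as np
               from scipy.signal import savgol_coeffs
               
               # First-derivative S-G kernel: cubic fit, 11-sample window
               kernel = savgol_coeffs(11, 3, deriv=1, delta=1.0, use="conv")
               y = np.sin(np.linspace(0.0, 10.0, 200))   # regularly sampled series
               dy = np.convolve(y, kernel, mode="same")  # smoothed derivative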
            
             The occasional pain-in-the-arse reality is missing data and|or
             present but glitched|spiked data .. both of which require a
             "sensible infill" to continue with a convolution.
            
             This is a nice implementation and a potentially useful bit of
             kit - the elephant in the room (from my PoV) is "how come the
             application domain is irregularly sampled data?"
            
            Generally (engineering, geophysics, etc) great lengths are taken to
            clock data samples like a metronome (in time and|or space (as
            required most)).
            
             I'm assuming that your gridded GeoTIFF data field is regularly
             sampled in both the X and Y axes?
       
            pm wrote 4 hours 49 min ago:
            Thanks for that, it looks like my research today is cut out for me.
       
          uoaei wrote 5 hours 12 min ago:
          Basically, approximating calculus operations on noisy,
          discrete-in-time data streams.
       
            pm wrote 4 hours 47 min ago:
             This is what I was thinking, but stated much more clearly than
             I'd have managed.
       
          hugohadfield wrote 5 hours 13 min ago:
          No problem! Let's dream up a little use case:
          
           Imagine you have a speed sensor, e.g. on your car, and you would
           like to calculate the jerk (2nd derivative of speed) of your
           motion (useful in a range of driving comfort metrics etc.). The
           speed sensor on your car is probably not all that accurate: it
           will give some slightly randomly wrong output, and it may not give
           that output at exactly 10 times per second, so you will have some
           jitter in the rate at which you receive data. If you naively
           attempt to calculate jerk by doing central differences on the
           signal twice (using np.gradient twice), you will amplify the noise
           in the signal and end up with something that looks totally wrong,
           which you will then have to post-process and maybe resample to get
           it at the rate that you want. If instead of np.gradient you use
           kalmangrad.grad, you will get a nice smooth jerk signal (and a
           fixed-up speed signal too).
           
           There are many ways to do this kind of thing, but I personally
           like this one as it's fast, can be run online, and if you want you
           can get uncertainties in your derivatives too :)
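           
           In code, the naive trap looks roughly like this sketch (the
           kalmangrad call below is illustrative only; see the package docs
           for the exact API):
           
             import numpy as np
             
             rng = np.random.default_rng(0)
             t = np.sort(rng.uniform(0.0, 10.0, 200))           # jittery sample times
             speed = np.sin(t) + rng.normal(0.0, 0.05, t.size)  # noisy speed sensor
             
             accel_naive = np.gradient(speed, t)       # first central difference
             jerk_naive = np.gradient(accel_naive, t)  # second pass: noise blows up
             
             # Smoother alternative (hypothetical call shape, check the docs):
             # import kalmangrad
             # results = kalmangrad.grad(speed, t, n=2)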
       
            pm wrote 4 hours 50 min ago:
            I'd been researching Kalman filters to smooth out some sampling
            values (working on mobile: anything from accelerometer values to
            voice activation detection), but hadn't got around to revising the
            mathematics, so I appreciate the explanation.  Out of curiosity,
            what other ways might this be achieved? I haven't seen much else
            beyond Kalman filters.
       
       
   DIR <- back to front page