        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
  HTML   Show HN: Greenmask 0.2 – Database anonymization tool
       
       
        wutwutwat wrote 2 hours 49 min ago:
         I’ve used [1] before to create trimmed-down dev databases based on
         scrubbed and fuzzed production data.
        
  HTML  [1]: https://postgresql-anonymizer.readthedocs.io/en/latest/
       
        jensenbox wrote 3 hours 19 min ago:
         Having already jumped from Replibyte to Greenmask, I can say it has a
         significantly better architecture, hands down.
       
        imiric wrote 4 hours 16 min ago:
        It's great seeing more tools in this space.
        
        I was recently researching ways of anonymizing production data for
         staging, and I also found existing tools either cumbersome to set up or
        lacking in features.
        
        I stumbled upon clickhouse-obfuscator[1], and really liked that it
        worked on standalone dump formats (CSV, Parquet, etc.) rather than any
        specific DBMS. I think that's a great approach for this, since it keeps
        things simple and generic, and it can be conveniently added as a middle
        step in the backup-restore pipeline. Unfortunately, the tool is quite
        barebones, and has issues maintaining referential integrity, so we had
        to abandon it.
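         
         To illustrate the referential integrity point, here is a minimal
         Python sketch of deterministic masking on standalone CSV dumps. It
         is not how clickhouse-obfuscator or Greenmask work; the key, the
         function names and the sample columns are all hypothetical. The idea
         is that a keyed hash maps the same input to the same token in every
         file, so joins on the masked column still line up.

```python
import csv
import hashlib
import hmac
import io

# Hypothetical secret; in practice it would be loaded from a secret store.
SECRET_KEY = b"replace-me"


def pseudonymize(value, key=SECRET_KEY):
    """Map a value to a stable token: same input and key give the same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_csv(text, columns):
    """Return a copy of a CSV dump with the named columns pseudonymized."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in columns:
            # Skip columns that are absent or empty in this row.
            if row.get(col):
                row[col] = pseudonymize(row[col])
        writer.writerow(row)
    return out.getvalue()


# Two made-up dumps that reference each other through an email column.
users = "email,name\nalice@example.com,Alice\n"
orders = "order_id,customer_email\n1,alice@example.com\n"

masked_users = mask_csv(users, ["email"])
masked_orders = mask_csv(orders, ["customer_email"])
```

         The two masked dumps still join on the email column, because both
         were processed with the same key. Rotating the key between files
         would break that link.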
        
        This is still an unsolved problem in our team, so I'll keep an eye on
        your tool. We would need support for ClickHouse as well, so it's good
        you're planning support for other DBMSs. Good luck!
        
  HTML  [1]: https://clickhouse.com/docs/en/operations/utilities/clickhouse...
       
        btown wrote 5 hours 47 min ago:
         This is really awesome - and it's so amazing that you've built this as
         a standalone tool!
        
        I can absolutely speak to the pain of having a dozen pg_dump
        --exclude-table-data arguments and having a developer experience that
        makes it difficult to reproduce bugs due to drift between production
        data and test fixtures (even if they share the same schema, assumptions
        can change massively!).
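         
         For anyone who hasn't seen this pattern, here is a rough Python
         sketch of the kind of invocation I mean. The database, file and
         table names are made up; --exclude-table-data is a real pg_dump
         option that keeps a table's definition but skips its rows.

```python
import shlex


def build_pg_dump_command(dbname, outfile, skip_data_tables):
    """Build a pg_dump argument list that dumps the schema for every table
    but leaves out the rows of the listed tables."""
    cmd = ["pg_dump", f"--dbname={dbname}", "--format=custom", f"--file={outfile}"]
    # One --exclude-table-data flag per table whose rows should be skipped.
    cmd += [f"--exclude-table-data={table}" for table in skip_data_tables]
    return cmd


# Hypothetical names, for illustration only.
cmd = build_pg_dump_command(
    "app_production",
    "dev.dump",
    ["public.audit_log", "public.sessions"],
)
print(shlex.join(cmd))
```

         The list only grows as the schema does, and nothing in it masks
         the tables that are still dumped in full.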
        
        Secure and robust database cloning also enables preview apps that
        actually answer the stakeholder question "can I see/play with what the
        new code would do, if applied to the actual [document/record/product
        listing] that motivated the feature/bugfix?" Subsetting and PII masking
        are both critical for this, and it's amazing to see that you've thought
        about them as integral parts of the same product.
        
        I really want to see a product like this succeed! The easier the tool
        is to use, the harder it might be to monetize... but there are so many
         applications of a tool like this, including ones that can materially
         improve security at organizations large and small ([1], posted here
         earlier today, remarks on this!), that I'm sure you'll find the
         right niche!
        
  HTML  [1]: https://nabeelqu.substack.com/i/150188028/secrets
       
       