Hacker News on Gopher (inofficial)

COMMENT PAGE FOR:
Show HN: Greenmask 0.2 – Database anonymization tool

wutwutwat wrote 2 hours 49 min ago:
I've used [1] before to create trimmed-down dev databases based on scrubbed and fuzzed production data.
[1]: https://postgresql-anonymizer.readthedocs.io/en/latest/

jensenbox wrote 3 hours 19 min ago:
Having jumped from Replibyte to Greenmask already I can say it is a significantly better architecture - hands down.

imiric wrote 4 hours 16 min ago:
It's great seeing more tools in this space. I was recently researching ways of anonymizing production data for staging, and I also found existing tools either cumbersome to set up or lacking in features. I stumbled upon clickhouse-obfuscator[1], and really liked that it worked on standalone dump formats (CSV, Parquet, etc.) rather than any specific DBMS. I think that's a great approach for this, since it keeps things simple and generic, and it can be conveniently added as a middle step in the backup-restore pipeline. Unfortunately, the tool is quite barebones, and has issues maintaining referential integrity, so we had to abandon it. This is still an unsolved problem in our team, so I'll keep an eye on your tool. We would need support for ClickHouse as well, so it's good you're planning support for other DBMSs. Good luck!
[1]: https://clickhouse.com/docs/en/operations/utilities/clickhouse...

btown wrote 5 hours 47 min ago:
This is really awesome - and it's so amazing that you've built this as a standalone tool!
I can absolutely speak to the pain of having a dozen pg_dump --exclude-table-data arguments and having a developer experience that makes it difficult to reproduce bugs due to drift between production data and test fixtures (even if they share the same schema, assumptions can change massively!). Secure and robust database cloning also enables preview apps that actually answer the stakeholder question "can I see/play with what the new code would do, if applied to the actual [document/record/product listing] that motivated the feature/bugfix?" Subsetting and PII masking are both critical for this, and it's amazing to see that you've thought about them as integral parts of the same product. I really want to see a product like this succeed! The easier the tool is to use, the harder it might be to monetize... but there are so many applications of a tool like this, including ones that can materially improve security at organizations large and small ([1], posted here earlier today, remarks on this!) that I'm sure you'll find the right niche!
[1]: https://nabeelqu.substack.com/i/150188028/secrets