# Sane TSV
## Roadmap
- Improve error reporting by including line/column information in exceptions
- Use this to get line numbers for parallel parsing implementations
- [x] Come up with a static-typing interface
Something that doesn't require an array of objects
Use a class with SaneTsv attributes
- Check numeric formatting matches spec
- [x] Maybe add a binary representation for f32/f64. It should be specified as little-endian (since we have to pick one). That way we can guarantee bit-compatibility between implementations for applications that require it.
- [x] Add Column name/type specification to API
- So you can tell it what columns to expect
- [ ] Lax/strict versions
See the attributes thing above
- Generate test cases
- [x] File comment / no file comment
- [x] header types / no header types
- [x] Line comments / no line comments
- [x] end of file comment
- [x] Test with the start index of parallel methods in last record
- end index in first record
- [x] Extra \n at end of file
- [x] Wrong number of fields
- Wrong number of fields at end of file
- [x] Do parallel parsing / serializing implementation
- [x] Next task: Refactor parsing so that it can start and end at arbitrary indices and return an array of SaneTsvRecords. The parser should skip the record the start index falls in (unless the start index is at the start of the buffer) and keep parsing through the end of the record the end index falls in.
- ~~More optimization and making parsing modular:~~
- Have callbacks for header parsing and field parsing
- That way other formats (like ExtraTSV) don't have to iterate through the entire set of data again.
- [x] Make untyped Simple TSV (De)serialization
- [x] ~~Finish~~ Minimal ExtraTSV implementation
- [ ] Do Zig implementation
- Make a C interface from that
- Make a command-line interface
- Make a viewer / editor
- Streaming interface
So you can start processing your data while it finishes parsing?
- [ ] Decoding a binary stream with a \0 in it via UTF-8 doesn't seem to cause any issues. I thought that valid UTF-8 wouldn't have a \0?
- [ ] Instead of exceptions when parsing, we should parse as much as possible and reflect parsing errors in the returned data structure
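
On the `\0` question above: U+0000 is an ordinary Unicode code point, and standard UTF-8 encodes it as the single byte `0x00`, so a strict decoder accepts it. (Java's "Modified UTF-8" is the variant that avoids a literal zero byte by encoding it as `0xC0 0x80`.) The check below is a small Python sketch, not part of this library; `contains_nul` is a hypothetical helper showing one way an implementation could reject such input itself if that turns out to be desirable.

```python
# U+0000 (NUL) is a valid code point; UTF-8 encodes it as the single byte 0x00.
data = b"a\tb\x00c\n"

# Strict error handling is the default for bytes.decode, and no exception is raised.
text = data.decode("utf-8")

# The NUL character survives decoding and re-encodes to the same bytes.
nul_survives = "\x00" in text
round_trips = text.encode("utf-8") == data


def contains_nul(buffer: bytes) -> bool:
    """Hypothetical helper: report whether a raw buffer contains a NUL byte."""
    return b"\x00" in buffer
```

If NUL bytes should not appear in a file, the parser would have to check for them explicitly, since UTF-8 validation alone will not flag them.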