# Data : JSON

\[ [Wikipedia](https://en.wikipedia.org/wiki/JSON) | [RFC 8259](https://www.rfc-editor.org/rfc/rfc8259) ]

###### Basics

- extension: `.json`
- media type: `application/json`
- encoding: `utf-8`
- no comments
- no trailing commas

###### Types

- number — signed decimal; doesn't distinguish between int and float; may use E notation; no non-numbers (like `NaN`)
- string — double-quotes; backslash escapes
- boolean — `true`, `false`
- array — `[]`
- object — `{}`
- null — `null` (both a type and a value)

###### JSON Pointers

\[ [RFC 6901](https://datatracker.ietf.org/doc/html/rfc6901) ]

A JSON Pointer describes a path to a value inside a JSON document. It can be used as a URI fragment.

Examples:

- `#/foo/0/bar` — eq: `obj[ "foo" ][ 0 ][ "bar" ]`
- `https://example.com/schemas/address#/properties/street_address`

## Python

#### json (stdlib)

\[ [docs](https://docs.python.org/3/library/json.html) ]

- encoders & decoders preserve order
- default encoder can't handle dates/times/datetimes, Decimals, UUIDs, or dataclasses

```python
import decimal
import json

# decoding: may raise json.JSONDecodeError
data = json.load( fp )                  # fp: readable file object
data = json.loads(
    s,                                  # JSON string
    parse_float = decimal.Decimal,      # to parse floats as Decimals instead
    object_hook = func,                 # see Customize Decoder (below)
)                                       # json.load accepts the same keyword arguments

# encoding
json.dump( data, fp )                   # write string to fp
json_str = json.dumps(
    data,
    indent = 4,                         # default: None
    sort_keys = True,                   # default: False
    cls = CustomJSONEncoder,            # JSONEncoder subclass; see Customize Encoder (below)
    default = func,                     # fallback for unserializable values; see Customize Encoder (below)
)                                       # json.dump accepts the same keyword arguments
```

###### Customize Decoder

```python
import json
import uuid

# called with each decoded dict - return value used in place of dict
# str_is_uuid is a user-supplied predicate (not shown here)
def custom_object_hook( d ):
    for k, v in d.items():
        if str_is_uuid( v ):
            d[ k ] = uuid.UUID( v )
    return d

data = json.loads( "...", object_hook=custom_object_hook )
```

###### Customize Encoder

Subclass `JSONEncoder` and override the `default` method, which is called for values that can't otherwise be serialized. It should return a JSON-encodable version of the object or raise `TypeError`.

```python
import json
import uuid

class CustomJSONEncoder( json.JSONEncoder ):
    def default( self, o ):
        if isinstance( o, uuid.UUID ):
            return str( o )
        return super().default( o )  # base implementation raises TypeError

json_str = json.dumps( data, cls=CustomJSONEncoder )
```

###### As CLI

```bash
# to pretty-print some JSON:
$ echo '{ "foo": 42 }' | python -m json.tool
$ python -m json.tool <file>.json
```

#### orjson

\[ [pypi](https://pypi.org/project/orjson/) | [src+docs](https://github.com/ijl/orjson/) ]

> "...the fastest Python library for JSON and is more correct than the standard json library or other third-party libraries."

Pros:

- recommended by the authors of [[DeepDiff]]
- supports dates/times/datetimes, UUIDs, dataclasses, numpy

Cons:

- serializes to bytes instead of strings
- no indentation by default; the only indent option is 2 spaces (`OPT_INDENT_2`)

```python
import orjson

data = orjson.loads( json_string )

json_bytes = orjson.dumps(
    data,
    option = orjson.OPT_SORT_KEYS | orjson.OPT_INDENT_2,
    default = func,  # see Extend Encoder (below)
)
json_str = json_bytes.decode( "utf-8" )
```

###### Extend Encoder

To encode additional types (such as `Decimal`), pass a `default` function, which receives each unrecognized object and must return a supported type or raise `TypeError`.

```python
import decimal
import orjson

def default( o ):
    if isinstance( o, decimal.Decimal ):
        return str( o )
    raise TypeError

# build the Decimal from a string: Decimal( 0.1 ) would carry the float's binary rounding error
json_str = orjson.dumps( decimal.Decimal( "0.1" ), default=default ).decode( "utf-8" )
```
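
For comparison, the stdlib `json.dumps` accepts the same kind of `default` callback (the `default` parameter listed in the stdlib section), so a one-off conversion doesn't need a `JSONEncoder` subclass. A minimal sketch, where `encode_extra` and the sample `data` are hypothetical names made up for illustration:

```python
import decimal
import json
import uuid

# hypothetical helper: json.dumps calls it only for values it can't serialize itself
def encode_extra( o ):
    if isinstance( o, ( decimal.Decimal, uuid.UUID ) ):
        return str( o )
    raise TypeError( f"Object of type {type( o ).__name__} is not JSON serializable" )

data = { "price": decimal.Decimal( "0.10" ), "id": uuid.UUID( int=1 ) }
json_str = json.dumps( data, default=encode_extra, sort_keys=True )
```

As in the orjson example, the `Decimal` comes out as a JSON string, so a reader has to convert it back (for instance in an `object_hook`) if the numeric type matters.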