# DML

DML stands for "Data Manipulation Language" and relates to inserting and modifying data in tables.

## COPY

Copies the contents of a table or query to file(s). Supported file formats are `parquet`, `csv`, `json`, and `arrow`.
```sql
COPY { table_name | query }
TO 'file_name'
[ STORED AS format ]
[ PARTITIONED BY column_name [, ...] ]
[ OPTIONS( option [, ... ] ) ]
```
`STORED AS` specifies the file format the `COPY` command will write. If this clause is not specified, the format will be inferred from the file extension if possible.

`PARTITIONED BY` specifies the columns to use for partitioning the output files into separate hive-style directories. By default, columns used in `PARTITIONED BY` will be removed from the output format. If you want to keep the columns, provide the option `execution.keep_partition_by_columns true`. The `execution.keep_partition_by_columns` flag can also be enabled through `ExecutionOptions` within `SessionConfig`.

The output format is determined by the first match of the following rules:

1. Value of `STORED AS`
2. Filename extension (e.g. `foo.parquet` implies `PARQUET` format)

For a detailed list of valid OPTIONS, see [Format Options](format_options.md).

### Examples

Copy the contents of `source_table` to `file_name.json` in JSON format:

```sql
> COPY source_table TO 'file_name.json';
+-------+
| count |
+-------+
| 2     |
+-------+
```

Copy the contents of `source_table` to one or more Parquet formatted files in the `dir_name` directory:

```sql
> COPY source_table TO 'dir_name' STORED AS PARQUET;
+-------+
| count |
+-------+
| 2     |
+-------+
```

Copy the contents of `source_table` to multiple directories of hive-style partitioned parquet files:

```sql
> COPY source_table TO 'dir_name' STORED AS parquet, PARTITIONED BY (column1, column2);
+-------+
| count |
+-------+
| 2     |
+-------+
```

If the data contains values of `x` and `y` in column1 and only `a` in column2, output files will appear in the following directory structure:

```text
dir_name/
  column1=x/
    column2=a/
      .parquet
      .parquet
      ...
  column1=y/
    column2=a/
      .parquet
      .parquet
      ...
```

Run the query `SELECT * from source ORDER BY time` and write the results (maintaining the order) to a parquet file named `output.parquet` with a maximum parquet row group size of 10MB:

```sql
> COPY (SELECT * from source ORDER BY time) TO 'output.parquet' OPTIONS (MAX_ROW_GROUP_SIZE 10000000);
+-------+
| count |
+-------+
| 2     |
+-------+
```

## INSERT

### Examples

Insert values into a table.
```sql
INSERT INTO table_name { VALUES ( expression [, ...] ) [, ...] | query }
```
```sql
> INSERT INTO target_table VALUES (1, 'Foo'), (2, 'Bar');
+-------+
| count |
+-------+
| 2     |
+-------+
```
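The syntax above also accepts a query in place of a `VALUES` list, so the result of a `SELECT` can be inserted directly. A minimal sketch, assuming a hypothetical `source_table` whose columns are compatible with `target_table`:

```sql
> INSERT INTO target_table SELECT * FROM source_table;
```

The number of rows reported depends on the contents of `source_table`.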