Determine which columns would be good candidates to be used as foreign keys of a table,
to reference the primary key column of another table of the dm object.
dm_enum_fk_candidates(dm, table, ref_table) enum_fk_candidates(zoomed_dm, ref_table)
| dm | A |
|---|---|
| table | The table whose columns should be tested for suitability as foreign keys. |
| ref_table | A table with a primary key. |
| zoomed_dm | A |
A tibble with the following columns:
columnscolumns of table,
candidateboolean: are these columns a candidate for a foreign key,
whyif not a candidate for a foreign key, explanation for for this.
dm_enum_fk_candidates() first checks if ref_table has a primary key set,
if not, an error is thrown.
If ref_table does have a primary key, then a join operation will be tried using
that key as the by argument of join() to match it to each column of table.
Attempting to join incompatible columns triggers an error.
The outcome of the join operation determines the value of the why column in the result:
an empty value for a column of table that is a suitable foreign key candidate
the count and percentage of missing matches for a column that is not suitable
the error message triggered for unsuitable candidates that may include the types of mismatched columns
enum_fk_candidates() works like dm_enum_fk_candidates() with the zoomed table as table.
These functions are marked "questioning" because we are not yet sure about
the interface, in particular if we need both dm_enum...() and enum...()
variants.
Changing the interface later seems harmless because these functions are
most likely used interactively.
Other foreign key functions:
dm_add_fk(),
dm_get_all_fks(),
dm_get_fk(),
dm_has_fk()
#> # A tibble: 19 x 3 #> columns candidate why #> <keys> <lgl> <chr> #> 1 origin TRUE "" #> 2 dest FALSE "242 entries (2.2%) of `flights$dest` not in `airport… #> 3 tailnum FALSE "11080 entries (98.7%) of `flights$tailnum` not in `a… #> 4 carrier FALSE "11227 entries (100%) of `flights$carrier` not in `ai… #> 5 air_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 6 arr_delay FALSE "Can't join on `x$value` x `y$value` because of incom… #> 7 arr_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 8 day FALSE "Can't join on `x$value` x `y$value` because of incom… #> 9 dep_delay FALSE "Can't join on `x$value` x `y$value` because of incom… #> 10 dep_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 11 distance FALSE "Can't join on `x$value` x `y$value` because of incom… #> 12 flight FALSE "Can't join on `x$value` x `y$value` because of incom… #> 13 hour FALSE "Can't join on `x$value` x `y$value` because of incom… #> 14 minute FALSE "Can't join on `x$value` x `y$value` because of incom… #> 15 month FALSE "Can't join on `x$value` x `y$value` because of incom… #> 16 sched_arr_t… FALSE "Can't join on `x$value` x `y$value` because of incom… #> 17 sched_dep_t… FALSE "Can't join on `x$value` x `y$value` because of incom… #> 18 time_hour FALSE "Can't join on `x$value` x `y$value` because of incom… #> 19 year FALSE "Can't join on `x$value` x `y$value` because of incom…#> # A tibble: 19 x 3 #> columns candidate why #> <keys> <lgl> <chr> #> 1 origin TRUE "" #> 2 dest FALSE "242 entries (2.2%) of `flights$dest` not in `airport… #> 3 tailnum FALSE "11080 entries (98.7%) of `flights$tailnum` not in `a… #> 4 carrier FALSE "11227 entries (100%) of `flights$carrier` not in `ai… #> 5 air_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 6 arr_delay FALSE "Can't join on `x$value` x `y$value` because of incom… #> 7 arr_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 8 day FALSE "Can't join on `x$value` x `y$value` because of incom… #> 9 dep_delay FALSE "Can't join on `x$value` x `y$value` because of incom… #> 10 dep_time FALSE "Can't join on `x$value` x `y$value` because of incom… #> 11 distance FALSE "Can't join on `x$value` x `y$value` because of incom… #> 12 flight FALSE "Can't join on `x$value` x `y$value` because of incom… #> 13 hour FALSE "Can't join on `x$value` x `y$value` because of incom… #> 14 minute FALSE "Can't join on `x$value` x `y$value` because of incom… #> 15 month FALSE "Can't join on `x$value` x `y$value` because of incom… #> 16 sched_arr_t… FALSE "Can't join on `x$value` x `y$value` because of incom… #> 17 sched_dep_t… FALSE "Can't join on `x$value` x `y$value` because of incom… #> 18 time_hour FALSE "Can't join on `x$value` x `y$value` because of incom… #> 19 year FALSE "Can't join on `x$value` x `y$value` because of incom…