Making A Single Prediction
from ml4pd import components
from ml4pd.streams import MaterialStream
from ml4pd.aspen_units import Distillation
Specify Molecules in System
Before any predictions can be made, the participating molecules (identified by their IUPAC names) must be added to components, similar to how Aspen operates. components then gathers features for these molecules, and streams check their input molecules against this list to make sure they have been declared. Under the hood, components uses the thermo and rdkit packages to gather additional identifiers (IUPAC names, SMILES strings, and CAS numbers) as well as molecular features.
components.set_components(['acetone', '2-butanone'])
thermo may occasionally misidentify an input molecule, so it is good practice to check by printing components:
components
| | name | iupac | smiles | cas |
|---|---|---|---|---|
| 0 | acetone | propan-2-one | CC(=O)C | 67-64-1 |
| 1 | 2-butanone | butan-2-one | CCC(=O)C | 78-93-3 |
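The check can also be automated. The following is a minimal sketch that does not call ml4pd: it compares identifiers copied by hand from the printed table above against CAS numbers you have verified independently. The `resolved` and `expected_cas` dictionaries and the `find_mismatches` helper are hypothetical names introduced here for illustration only.

```python
# Hypothetical helper, not part of ml4pd: compare resolved identifiers
# against CAS numbers that you have verified independently.

# Values copied by hand from the printed `components` table above.
resolved = {
    "acetone": {"iupac": "propan-2-one", "smiles": "CC(=O)C", "cas": "67-64-1"},
    "2-butanone": {"iupac": "butan-2-one", "smiles": "CCC(=O)C", "cas": "78-93-3"},
}

# Reference CAS numbers for the molecules you intended to declare.
expected_cas = {"acetone": "67-64-1", "2-butanone": "78-93-3"}


def find_mismatches(resolved, expected_cas):
    """Return the names whose resolved CAS number is missing or differs."""
    mismatches = []
    for name, cas in expected_cas.items():
        entry = resolved.get(name)
        if entry is None or entry["cas"] != cas:
            mismatches.append(name)
    return mismatches


print(find_mismatches(resolved, expected_cas))  # prints []
```

An empty list means every declared name resolved to the molecule you intended. Any name that does appear in the list should be re-declared with a less ambiguous identifier.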
Create Streams
Streams can be lazily initialized without any parameters, but they must be called to assemble the data used for ML. A few notes on how to specify data for streams:
- Molecules and flowrates can only be specified during the call, not at initialization.
- Molecules and flowrates must be given as dictionaries (see code).
- All other parameters can be specified at initialization or during the call.
- If the same parameter is specified at both initialization and call, the value given at call takes precedence.
molecules = {"name_A": ["acetone"], "name_B": ["2-butanone"]}
flowrates = {"flowrate_A": [0.5], "flowrate_B": [0.5]}
feed_stream_1 = MaterialStream(vapor_fraction=0, pressure=2)(molecules=molecules, flowrates=flowrates)
feed_stream_2 = MaterialStream()(molecules=molecules, flowrates=flowrates, vapor_fraction=0, pressure=2)
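To make the precedence rule concrete, here is a minimal sketch of the "initialize lazily, then call" pattern. `LazyStream` is a hypothetical stand-in written for this illustration. It is not ml4pd's `MaterialStream` and makes no claim about how the library is implemented; it only mimics the behavior described in the notes above.

```python
# Illustrative stand-in, NOT ml4pd's MaterialStream: it only mimics the
# "initialize lazily, then call" pattern and the precedence rule.

class LazyStream:
    def __init__(self, **init_params):
        # Parameters given at initialization are stored; nothing is computed yet.
        self.params = dict(init_params)
        self.data = None

    def __call__(self, molecules, flowrates, **call_params):
        # Call-time values take precedence over initialization-time values.
        self.params.update(call_params)
        self.data = {**molecules, **flowrates, **self.params}
        return self  # returning self allows LazyStream(...)(...) chaining


molecules = {"name_A": ["acetone"], "name_B": ["2-butanone"]}
flowrates = {"flowrate_A": [0.5], "flowrate_B": [0.5]}

# pressure is given twice; the call-time value (2) wins over the init value (1).
stream = LazyStream(vapor_fraction=0, pressure=1)(
    molecules=molecules, flowrates=flowrates, pressure=2
)
print(stream.params)  # prints {'vapor_fraction': 0, 'pressure': 2}
```

With this pattern, one configured object can be kept around for shared settings while per-run values are supplied at call time.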
The data a stream holds for ML is stored in its data attribute.
feed_stream_1.data
| | name_A | name_B | smiles_A | iupac_A | cas_A | MaxEStateIndex_A | MinEStateIndex_A | qed_A | MolWt_A | HeavyAtomMolWt_A | NumValenceElectrons_A | MaxPartialCharge_A | MinPartialCharge_A | FpDensityMorgan1_A | FpDensityMorgan2_A | FpDensityMorgan3_A | BCUT2D_MWHI_A | BCUT2D_MWLOW_A | BCUT2D_CHGHI_A | BCUT2D_CHGLO_A | BCUT2D_LOGPHI_A | BCUT2D_LOGPLOW_A | BCUT2D_MRHI_A | BCUT2D_MRLOW_A | BalabanJ_A | BertzCT_A | Chi0_A | Chi0n_A | Chi0v_A | Chi1_A | Chi1n_A | Chi1v_A | Chi2n_A | Chi2v_A | Chi3n_A | Chi3v_A | Chi4n_A | Chi4v_A | HallKierAlpha_A | Ipc_A | Kappa1_A | Kappa2_A | Kappa3_A | LabuteASA_A | PEOE_VSA1_A | PEOE_VSA10_A | PEOE_VSA14_A | PEOE_VSA2_A | PEOE_VSA4_A | PEOE_VSA6_A | PEOE_VSA7_A | PEOE_VSA8_A | SMR_VSA1_A | SMR_VSA10_A | SMR_VSA5_A | SlogP_VSA2_A | SlogP_VSA3_A | SlogP_VSA5_A | TPSA_A | EState_VSA10_A | EState_VSA2_A | EState_VSA3_A | EState_VSA4_A | EState_VSA5_A | EState_VSA6_A | EState_VSA7_A | EState_VSA8_A | EState_VSA9_A | VSA_EState2_A | VSA_EState5_A | VSA_EState7_A | VSA_EState8_A | FractionCSP3_A | HeavyAtomCount_A | NHOHCount_A | NOCount_A | NumHAcceptors_A | NumHeteroatoms_A | NumRotatableBonds_A | MolLogP_A | MolMR_A | fr_C_O_A | fr_C_O_noCOO_A | fr_ketone_A | fr_ketone_Topliss_A | fr_unbrch_alkane_A | tm_A | tb_A | tc_A | pc_A | vc_A | z_A | rhoc_A | acenttric_factor_A | triple_temp_A | triple_pres_A | heat_vaporization_A | heat_fusion_A | stockmayer_param_A | solubility_param_A | parachor_A | smiles_B | iupac_B | cas_B | MaxEStateIndex_B | MinEStateIndex_B | qed_B | MolWt_B | HeavyAtomMolWt_B | NumValenceElectrons_B | MaxPartialCharge_B | MinPartialCharge_B | FpDensityMorgan1_B | FpDensityMorgan2_B | FpDensityMorgan3_B | BCUT2D_MWHI_B | BCUT2D_MWLOW_B | BCUT2D_CHGHI_B | BCUT2D_CHGLO_B | BCUT2D_LOGPHI_B | BCUT2D_LOGPLOW_B | BCUT2D_MRHI_B | BCUT2D_MRLOW_B | BalabanJ_B | BertzCT_B | Chi0_B | Chi0n_B | Chi0v_B | Chi1_B | Chi1n_B | Chi1v_B | Chi2n_B | Chi2v_B | Chi3n_B | Chi3v_B | Chi4n_B | Chi4v_B | HallKierAlpha_B | Ipc_B | Kappa1_B | Kappa2_B | Kappa3_B | LabuteASA_B | PEOE_VSA1_B | PEOE_VSA10_B | PEOE_VSA14_B | PEOE_VSA2_B | PEOE_VSA4_B | PEOE_VSA6_B | PEOE_VSA7_B | PEOE_VSA8_B | SMR_VSA1_B | SMR_VSA10_B | SMR_VSA5_B | SlogP_VSA2_B | SlogP_VSA3_B | SlogP_VSA5_B | TPSA_B | EState_VSA10_B | EState_VSA2_B | EState_VSA3_B | EState_VSA4_B | EState_VSA5_B | EState_VSA6_B | EState_VSA7_B | EState_VSA8_B | EState_VSA9_B | VSA_EState2_B | VSA_EState5_B | VSA_EState7_B | VSA_EState8_B | FractionCSP3_B | HeavyAtomCount_B | NHOHCount_B | NOCount_B | NumHAcceptors_B | NumHeteroatoms_B | NumRotatableBonds_B | MolLogP_B | MolMR_B | fr_C_O_B | fr_C_O_noCOO_B | fr_ketone_B | fr_ketone_Topliss_B | fr_unbrch_alkane_B | tm_B | tb_B | tc_B | pc_B | vc_B | z_B | rhoc_B | acenttric_factor_B | triple_temp_B | triple_pres_B | heat_vaporization_B | heat_fusion_B | stockmayer_param_B | solubility_param_B | parachor_B | feed0_flowrate_A | feed0_flowrate_B | feed0_temperature | feed0_vapor_fraction | feed0_pressure | feed0_comp_no |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | acetone | 2-butanone | CC(=O)C | propan-2-one | 67-64-1 | 9.444444 | 0.166667 | 0.398237 | 58.08 | 52.032 | 24 | 0.126268 | -0.300344 | 1.5 | 1.5 | 1.5 | 16.136528 | 10.550822 | 1.619125 | -1.557836 | 1.500722 | -1.691321 | 5.717069 | -0.114493 | 2.803039 | 26.264663 | 3.57735 | 2.908248 | 2.908248 | 1.732051 | 1.204124 | 1.204124 | 0.908248 | 0.908248 | 0.0 | 0.0 | 0.0 | 0.0 | -0.33 | 3.245112 | 3.67 | 1.044532 | 6.883958 | 25.630657 | 4.794537 | 5.783245 | 0.0 | 0.0 | 0.0 | 0.0 | 13.847474 | 0.0 | 4.794537 | 5.783245 | 13.847474 | 5.783245 | 4.794537 | 13.847474 | 17.07 | 4.794537 | 5.783245 | 0.0 | 0.0 | 13.847474 | 0.0 | 0.0 | 0.0 | 0.0 | 9.444444 | 0.166667 | 0.0 | 3.055556 | 0.666667 | 4 | 0 | 1 | 1 | 1 | 0 | 0.5953 | 16.355 | 1 | 1 | 1 | 1 | 0 | 178.35 | 329.23 | 508.1 | 4700000.0 | 0.000213 | 0.23697 | 272.672019 | 0.309 | 176.732207 | 3.047365 | 29122.034793 | 5770.0 | 332.97 | 19638.941183 | 0.000029 | CCC(=O)C | butan-2-one | 78-93-3 | 9.8125 | 0.25463 | 0.451051 | 72.107 | 64.043 | 30 | 0.129065 | -0.300042 | 1.8 | 2.0 | 2.0 | 16.137114 | 10.3607 | 1.764362 | -1.711045 | 1.710825 | -1.797012 | 5.744225 | -0.116199 | 2.847379 | 38.912609 | 4.284457 | 3.615355 | 3.615355 | 2.270056 | 1.764784 | 1.764784 | 1.055568 | 1.055568 | 0.497891 | 0.497891 | 0.0 | 0.0 | -0.33 | 9.651484 | 4.67 | 1.94248 | 3.67 | 31.995599 | 4.794537 | 5.783245 | 0.0 | 0.0 | 0.0 | 6.923737 | 6.923737 | 6.420822 | 4.794537 | 5.783245 | 20.268296 | 5.783245 | 4.794537 | 20.268296 | 17.07 | 4.794537 | 5.783245 | 6.420822 | 0.0 | 0.0 | 6.923737 | 6.923737 | 0.0 | 0.0 | 9.8125 | 0.25463 | 0.666667 | 3.43287 | 0.75 | 5 | 0 | 1 | 1 | 1 | 1 | 0.9854 | 20.972 | 1 | 1 | 1 | 1 | 0 | 186.35 | 352.75 | 536.7 | 4207000.0 | 0.000267 | 0.25172 | 270.058876 | 0.329 | 186.46881 | 1.735133 | 31555.836658 | 8390.0 | 415.48 | 18879.750645 | 0.000036 | 0.5 | 0.5 | None | 0.0 | 2.0 | 2 |
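The printed row is very wide, but its column names follow a visible pattern: per-molecule features end in `_A` or `_B`, and stream-level inputs start with `feed0_`. The sketch below uses that pattern to pull out one group at a time. It works on a plain dictionary holding a handful of values copied from the output above, so it does not assume anything about the type of the `data` attribute. The names `row`, `stream_inputs`, and `features_a` are introduced here for illustration.

```python
# A few entries copied from the printed row above; the full row has many more columns.
row = {
    "name_A": "acetone",
    "name_B": "2-butanone",
    "MolWt_A": 58.08,
    "MolWt_B": 72.107,
    "feed0_flowrate_A": 0.5,
    "feed0_flowrate_B": 0.5,
    "feed0_temperature": None,
    "feed0_vapor_fraction": 0.0,
    "feed0_pressure": 2.0,
    "feed0_comp_no": 2,
}

# Stream-level inputs carry the "feed0_" prefix.
stream_inputs = {k: v for k, v in row.items() if k.startswith("feed0_")}

# Per-molecule features end in "_A"; exclude stream-level columns such as
# "feed0_flowrate_A", which share the suffix.
features_a = {
    k: v for k, v in row.items()
    if k.endswith("_A") and not k.startswith("feed0_")
}

print(sorted(stream_inputs))
print(features_a)
```

Note that `feed0_temperature` is `None` in the output above: the stream was specified through its vapor fraction and pressure, and no temperature was given.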
Create Columns
Like streams, columns can be lazily initialized but have to be called to produce predictions. A few notes on how to specify data for columns:
- Streams can only be fed into columns during the call.
- All other parameters can be specified at initialization or during the call.
- If the same parameter is specified at both initialization and call, the value given at call takes precedence.
dist_col_1 = Distillation(reflux_ratio=0.1, boilup_ratio=0.1, pressure=2, no_stages=4, feed_stage=2)
dist_col_2 = Distillation()
bott_1, dist_1 = dist_col_1(feed_stream_1)
bott_2, dist_2 = dist_col_2(reflux_ratio=0.1, boilup_ratio=0.1, pressure=2, no_stages=4, feed_stage=2, feed_stream=feed_stream_2)
Inspect Results
A column returns stream objects filled with ML results and also stores ML results on itself. The types and number of stream objects returned, as well as the results available, depend on the type of column and the data specified.
bott_1.flow
| | flowrate_A | flowrate_B |
|---|---|---|
| 0 | 0.443523 | 0.476227 |
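A quick sanity check on predicted flowrates is a component balance around the column. The sketch below is plain arithmetic on the numbers shown above and does not call ml4pd. It assumes steady state and that feed and bottoms flowrates are reported in the same units; neither is stated in this tutorial. Because the outputs are ML predictions, the distillate flow the model actually reports may not match the value implied by the balance exactly.

```python
# Feed flowrates as specified when the feed stream was called.
feed = {"flowrate_A": 0.5, "flowrate_B": 0.5}

# Bottoms flowrates copied from the printed bott_1.flow output above.
bottoms = {"flowrate_A": 0.443523, "flowrate_B": 0.476227}

# Distillate implied by a steady-state component balance:
# feed = bottoms + distillate (assumption, see the text above).
implied_distillate = {key: feed[key] - bottoms[key] for key in feed}

# Fraction of each component's feed that leaves in the bottoms.
bottoms_recovery = {key: bottoms[key] / feed[key] for key in feed}

for key in feed:
    print(key, round(implied_distillate[key], 6), round(bottoms_recovery[key], 6))
```

Comparing `implied_distillate` with the flowrates reported by the distillate stream gives a rough measure of how well the predictions close the balance.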
bott_2.temperature
[88.45621634567827]
dist_col_1.condensor_duty
[-1682.6839428518751]