Making A Single Prediction¶
from ml4pd import components
from ml4pd.streams import MaterialStream
from ml4pd.aspen_units import Distillation
Specify Molecules in System¶
Before any predictions can be made, the participating molecules (identified by their IUPAC names) must be added to components, similar to how Aspen operates. components then gathers features for these molecules, and streams check their input molecules against this list to make sure they have been declared. Under the hood, components uses the thermo and rdkit packages to gather additional identifiers (IUPAC names, SMILES strings, and CAS numbers) as well as molecular features.
components.set_components(['acetone', '2-butanone'])
Sometimes thermo might misidentify the input molecules, so it's good practice to check by printing components, like so.
components
name | iupac | smiles | cas | |
---|---|---|---|---|
0 | acetone | propan-2-one | CC(=O)C | 67-64-1 |
1 | 2-butanone | butan-2-one | CCC(=O)C | 78-93-3 |
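The declaration check described above can be pictured with a small standalone sketch. This is not ml4pd's implementation: the `declared` set and the `set_components` and `check_declared` functions below are hypothetical stand-ins that only illustrate the idea of validating a stream's molecules against a previously declared list.

```python
# Hypothetical stand-in for a declared-component registry; not ml4pd's API.
declared = set()


def set_components(names):
    """Record the molecule names that streams are allowed to reference."""
    declared.clear()
    declared.update(names)


def check_declared(molecules):
    """Raise ValueError if the molecules dict references an undeclared name."""
    used = {name for names in molecules.values() for name in names}
    missing = sorted(used - declared)
    if missing:
        raise ValueError(f"undeclared molecules: {missing}")
    return True


set_components(["acetone", "2-butanone"])
ok = check_declared({"name_A": ["acetone"], "name_B": ["2-butanone"]})
```

Failing early with a message that names the missing molecules keeps the error close to the place where the stream was specified.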
Create Streams¶
Streams can be lazily initialized without any parameters, but they need to be called to put together the data for ML. A few notes on how to specify data for streams:
- Information on molecules & flowrates can only be specified during the call, not during initialization.
- Molecules & flowrates have to be in the form of dictionaries (see code).
- All other parameters can be specified during initialization or call.
- If the same parameter is specified during both initialization and call, the call-time value takes precedence.
molecules = {"name_A": ["acetone"], "name_B": ["2-butanone"]}
flowrates = {"flowrate_A": [0.5], "flowrate_B": [0.5]}
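# Option 1: set stream-level parameters at initialization, then call with molecules & flowrates.
feed_stream_1 = MaterialStream(vapor_fraction=0, pressure=2)(molecules=molecules, flowrates=flowrates)
# Option 2: initialize without parameters and pass everything during the call.
feed_stream_2 = MaterialStream()(molecules=molecules, flowrates=flowrates, vapor_fraction=0, pressure=2)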
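The initialize-then-call pattern and its override rule can be sketched without the library. The `LazyStream` class below is a hypothetical illustration, assuming that parameters are simply merged with call-time values winning; it says nothing about how MaterialStream is actually written.

```python
class LazyStream:
    """Hypothetical illustration of init-then-call parameter merging."""

    def __init__(self, **init_params):
        # Parameters given at initialization are stored as defaults.
        self.params = dict(init_params)

    def __call__(self, molecules, flowrates, **call_params):
        # Call-time values take precedence over initialization values.
        self.params.update(call_params)
        self.molecules = molecules
        self.flowrates = flowrates
        return self


stream = LazyStream(vapor_fraction=0, pressure=1)(
    molecules={"name_A": ["acetone"]},
    flowrates={"flowrate_A": [1.0]},
    pressure=2,
)
print(stream.params)  # {'vapor_fraction': 0, 'pressure': 2}
```

Returning `self` from `__call__` is what allows the one-line `Class(...)(...)` form used in the tutorial code.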
The data that a stream holds for ML is stored in its data attribute.
feed_stream_1.data
name_A | name_B | smiles_A | iupac_A | cas_A | MaxEStateIndex_A | MinEStateIndex_A | qed_A | MolWt_A | HeavyAtomMolWt_A | NumValenceElectrons_A | MaxPartialCharge_A | MinPartialCharge_A | FpDensityMorgan1_A | FpDensityMorgan2_A | FpDensityMorgan3_A | BCUT2D_MWHI_A | BCUT2D_MWLOW_A | BCUT2D_CHGHI_A | BCUT2D_CHGLO_A | BCUT2D_LOGPHI_A | BCUT2D_LOGPLOW_A | BCUT2D_MRHI_A | BCUT2D_MRLOW_A | BalabanJ_A | BertzCT_A | Chi0_A | Chi0n_A | Chi0v_A | Chi1_A | Chi1n_A | Chi1v_A | Chi2n_A | Chi2v_A | Chi3n_A | Chi3v_A | Chi4n_A | Chi4v_A | HallKierAlpha_A | Ipc_A | Kappa1_A | Kappa2_A | Kappa3_A | LabuteASA_A | PEOE_VSA1_A | PEOE_VSA10_A | PEOE_VSA14_A | PEOE_VSA2_A | PEOE_VSA4_A | PEOE_VSA6_A | PEOE_VSA7_A | PEOE_VSA8_A | SMR_VSA1_A | SMR_VSA10_A | SMR_VSA5_A | SlogP_VSA2_A | SlogP_VSA3_A | SlogP_VSA5_A | TPSA_A | EState_VSA10_A | EState_VSA2_A | EState_VSA3_A | EState_VSA4_A | EState_VSA5_A | EState_VSA6_A | EState_VSA7_A | EState_VSA8_A | EState_VSA9_A | VSA_EState2_A | VSA_EState5_A | VSA_EState7_A | VSA_EState8_A | FractionCSP3_A | HeavyAtomCount_A | NHOHCount_A | NOCount_A | NumHAcceptors_A | NumHeteroatoms_A | NumRotatableBonds_A | MolLogP_A | MolMR_A | fr_C_O_A | fr_C_O_noCOO_A | fr_ketone_A | fr_ketone_Topliss_A | fr_unbrch_alkane_A | tm_A | tb_A | tc_A | pc_A | vc_A | z_A | rhoc_A | acenttric_factor_A | triple_temp_A | triple_pres_A | heat_vaporization_A | heat_fusion_A | stockmayer_param_A | solubility_param_A | parachor_A | smiles_B | iupac_B | cas_B | MaxEStateIndex_B | MinEStateIndex_B | qed_B | MolWt_B | HeavyAtomMolWt_B | NumValenceElectrons_B | MaxPartialCharge_B | MinPartialCharge_B | FpDensityMorgan1_B | FpDensityMorgan2_B | FpDensityMorgan3_B | BCUT2D_MWHI_B | BCUT2D_MWLOW_B | BCUT2D_CHGHI_B | BCUT2D_CHGLO_B | BCUT2D_LOGPHI_B | BCUT2D_LOGPLOW_B | BCUT2D_MRHI_B | BCUT2D_MRLOW_B | BalabanJ_B | BertzCT_B | Chi0_B | Chi0n_B | Chi0v_B | Chi1_B | Chi1n_B | Chi1v_B | Chi2n_B | Chi2v_B | Chi3n_B | Chi3v_B | Chi4n_B | Chi4v_B | HallKierAlpha_B | Ipc_B | Kappa1_B | Kappa2_B | Kappa3_B | LabuteASA_B | PEOE_VSA1_B | PEOE_VSA10_B | PEOE_VSA14_B | PEOE_VSA2_B | PEOE_VSA4_B | PEOE_VSA6_B | PEOE_VSA7_B | PEOE_VSA8_B | SMR_VSA1_B | SMR_VSA10_B | SMR_VSA5_B | SlogP_VSA2_B | SlogP_VSA3_B | SlogP_VSA5_B | TPSA_B | EState_VSA10_B | EState_VSA2_B | EState_VSA3_B | EState_VSA4_B | EState_VSA5_B | EState_VSA6_B | EState_VSA7_B | EState_VSA8_B | EState_VSA9_B | VSA_EState2_B | VSA_EState5_B | VSA_EState7_B | VSA_EState8_B | FractionCSP3_B | HeavyAtomCount_B | NHOHCount_B | NOCount_B | NumHAcceptors_B | NumHeteroatoms_B | NumRotatableBonds_B | MolLogP_B | MolMR_B | fr_C_O_B | fr_C_O_noCOO_B | fr_ketone_B | fr_ketone_Topliss_B | fr_unbrch_alkane_B | tm_B | tb_B | tc_B | pc_B | vc_B | z_B | rhoc_B | acenttric_factor_B | triple_temp_B | triple_pres_B | heat_vaporization_B | heat_fusion_B | stockmayer_param_B | solubility_param_B | parachor_B | feed0_flowrate_A | feed0_flowrate_B | feed0_temperature | feed0_vapor_fraction | feed0_pressure | feed0_comp_no | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | acetone | 2-butanone | CC(=O)C | propan-2-one | 67-64-1 | 9.444444 | 0.166667 | 0.398237 | 58.08 | 52.032 | 24 | 0.126268 | -0.300344 | 1.5 | 1.5 | 1.5 | 16.136528 | 10.550822 | 1.619125 | -1.557836 | 1.500722 | -1.691321 | 5.717069 | -0.114493 | 2.803039 | 26.264663 | 3.57735 | 2.908248 | 2.908248 | 1.732051 | 1.204124 | 1.204124 | 0.908248 | 0.908248 | 0.0 | 0.0 | 0.0 | 0.0 | -0.33 | 3.245112 | 3.67 | 1.044532 | 6.883958 | 25.630657 | 4.794537 | 5.783245 | 0.0 | 0.0 | 0.0 | 0.0 | 13.847474 | 0.0 | 4.794537 | 5.783245 | 13.847474 | 5.783245 | 4.794537 | 13.847474 | 17.07 | 4.794537 | 5.783245 | 0.0 | 0.0 | 13.847474 | 0.0 | 0.0 | 0.0 | 0.0 | 9.444444 | 0.166667 | 0.0 | 3.055556 | 0.666667 | 4 | 0 | 1 | 1 | 1 | 0 | 0.5953 | 16.355 | 1 | 1 | 1 | 1 | 0 | 178.35 | 329.23 | 508.1 | 4700000.0 | 0.000213 | 0.23697 | 272.672019 | 0.309 | 176.732207 | 3.047365 | 29122.034793 | 5770.0 | 332.97 | 19638.941183 | 0.000029 | CCC(=O)C | butan-2-one | 78-93-3 | 9.8125 | 0.25463 | 0.451051 | 72.107 | 64.043 | 30 | 0.129065 | -0.300042 | 1.8 | 2.0 | 2.0 | 16.137114 | 10.3607 | 1.764362 | -1.711045 | 1.710825 | -1.797012 | 5.744225 | -0.116199 | 2.847379 | 38.912609 | 4.284457 | 3.615355 | 3.615355 | 2.270056 | 1.764784 | 1.764784 | 1.055568 | 1.055568 | 0.497891 | 0.497891 | 0.0 | 0.0 | -0.33 | 9.651484 | 4.67 | 1.94248 | 3.67 | 31.995599 | 4.794537 | 5.783245 | 0.0 | 0.0 | 0.0 | 6.923737 | 6.923737 | 6.420822 | 4.794537 | 5.783245 | 20.268296 | 5.783245 | 4.794537 | 20.268296 | 17.07 | 4.794537 | 5.783245 | 6.420822 | 0.0 | 0.0 | 6.923737 | 6.923737 | 0.0 | 0.0 | 9.8125 | 0.25463 | 0.666667 | 3.43287 | 0.75 | 5 | 0 | 1 | 1 | 1 | 1 | 0.9854 | 20.972 | 1 | 1 | 1 | 1 | 0 | 186.35 | 352.75 | 536.7 | 4207000.0 | 0.000267 | 0.25172 | 270.058876 | 0.329 | 186.46881 | 1.735133 | 31555.836658 | 8390.0 | 415.48 | 18879.750645 | 0.000036 | 0.5 | 0.5 | None | 0.0 | 2.0 | 2 |
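The column names in this output follow a visible pattern: per-molecule features carry an _A or _B suffix, and stream-level values carry a feed0 prefix. The sketch below reproduces that naming scheme with plain dictionaries. The `flatten_features` helper is hypothetical and only mirrors the observed naming, not how ml4pd builds the table; the molecular weights are taken from the row above.

```python
def flatten_features(per_molecule, stream_values, prefix="feed0"):
    """Merge per-molecule feature dicts and stream values into one flat row.

    Molecule features get a letter suffix (_A, _B, ...) by position;
    stream-level values get the given prefix.
    """
    row = {}
    for index, features in enumerate(per_molecule):
        suffix = chr(ord("A") + index)
        for key, value in features.items():
            row[f"{key}_{suffix}"] = value
    for key, value in stream_values.items():
        row[f"{prefix}_{key}"] = value
    return row


row = flatten_features(
    [{"name": "acetone", "MolWt": 58.08}, {"name": "2-butanone", "MolWt": 72.107}],
    {"flowrate_A": 0.5, "flowrate_B": 0.5, "pressure": 2.0},
)
print(sorted(row))
```

A flat, one-row-per-case layout like this is a common way to feed tabular ML models, which is presumably why the stream data takes this shape.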
Create Columns¶
Like streams, columns can be lazily initialized but have to be called to produce predictions. A few notes on how to specify data for columns:
- Streams can only be fed into columns during the call.
- All other parameters can be specified during initialization or call.
- If the same parameter is specified during both initialization and call, the call-time value takes precedence.
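# Option 1: configure the column at initialization, then feed it a stream during the call.
dist_col_1 = Distillation(reflux_ratio=0.1, boilup_ratio=0.1, pressure=2, no_stages=4, feed_stage=2)
dist_col_2 = Distillation()
bott_1, dist_1 = dist_col_1(feed_stream_1)
# Option 2: initialize without parameters and pass everything during the call.
bott_2, dist_2 = dist_col_2(reflux_ratio=0.1, boilup_ratio=0.1, pressure=2, no_stages=4, feed_stage=2, feed_stream=feed_stream_2)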
Inspect Results¶
A column returns stream objects filled with ML results and also stores ML results on itself. The type and number of returned stream objects, as well as the results available via ML, depend on the type of column and the data specified.
bott_1.flow
flowrate_A | flowrate_B | |
---|---|---|
0 | 0.443523 | 0.476227 |
bott_2.temperature
[88.45621634567827]
dist_col_1.condensor_duty
[-1682.6839428518751]
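As a quick sanity check on the predicted bottoms stream, the two flowrates shown for bott_1.flow can be turned into component fractions. This is plain arithmetic on the printed values, copied by hand into a dictionary; the x_A and x_B names are chosen here for illustration and are not ml4pd attributes.

```python
# Values copied from the bott_1.flow output above.
bottoms = {"flowrate_A": 0.443523, "flowrate_B": 0.476227}

# Total bottoms flow and each component's share of it.
total = sum(bottoms.values())
fractions = {
    key.replace("flowrate_", "x_"): value / total
    for key, value in bottoms.items()
}

print(round(total, 5))
print({key: round(value, 3) for key, value in fractions.items()})
```

The bottoms come out at roughly 48% acetone and 52% 2-butanone, so the heavier ketone is only slightly enriched relative to the equal-flow feed.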