I'm going to add a new param to the CODA environments called RunNumber, and I'm going to use that (combined with the MDP number) to generate the random seed for transition and reward stochasticity. This way, for every MDP, you *can* control these features independently. The only problem is that I have just run 600 000 experiments without this parameter, so I need to re-run all of those experiments. Shux.
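A minimal sketch of what that seed derivation might look like. The function name `make_seed` and the `stream` label are illustrative, not anything from the actual CODA codebase; the point is just that hashing (MDPNumber, RunNumber, stream) gives reproducible, decorrelated seeds for the two kinds of stochasticity:

```python
# Hypothetical sketch: derive independent, reproducible seeds for
# transition and reward stochasticity from (MDPNumber, RunNumber).
# None of these names come from the actual CODA code.
import hashlib

def make_seed(mdp_number: int, run_number: int, stream: str) -> int:
    # Hash the triple so the transition and reward streams stay
    # decorrelated even for adjacent (mdp, run) pairs.
    key = f"{mdp_number}:{run_number}:{stream}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

transition_seed = make_seed(mdp_number=2, run_number=0, stream="transition")
reward_seed = make_seed(mdp_number=2, run_number=0, stream="reward")
```

Same (MDP, run) pair always yields the same two seeds, so any individual run can be reproduced exactly.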
One strategy would be to have each run as its own experiment. For example, we could add a runNumber param to the experiment's paramSummary. This would later give us fine control over the runs in SQL, because we could do queries over specific run numbers. The downside of this approach is that there would be a different experimentId for each run, which means queries directly over the resultRecords might be more complicated. If the runs shared an experimentId, we could do:
select max(score) from resultRecords where experimentId=(select id from Experiment where MDPNumber=2)
Now we'd have to do
select max(score) from resultRecords where experimentId in (select id from Experiment where MDPNumber=2)
Maybe that's not so bad. Maybe it gets worse when we have more complicated queries. Not quite sure.
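To sanity-check the second shape, here's a small sqlite3 demo. The schema is a guess at the real tables, pared down to the columns the queries mention:

```python
# Illustrative demo of the "one experimentId per run" query shape.
# The schema here is an assumption, not the actual CODA tables.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table Experiment (id integer primary key, MDPNumber int, runNumber int);
create table resultRecords (experimentId int, agentId int, score real);
insert into Experiment values (1, 2, 0), (2, 2, 1), (3, 3, 0);
insert into resultRecords values (1, 7, 0.5), (2, 7, 0.9), (3, 7, 0.4);
""")

# With a distinct experimentId per run, the subquery returns several
# ids, so "in" is required where "=" used to suffice:
(best,) = con.execute(
    "select max(score) from resultRecords "
    "where experimentId in (select id from Experiment where MDPNumber=2)"
).fetchone()
print(best)  # 0.9
```

The "=" form would raise an error here once MDPNumber=2 matches more than one experiment row, which is exactly the complication described above.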
The opposite strategy is to drop the unique index on ResultRecords over (AgentId, ExperimentId). This would mean that there would literally just be multiple ResultRecords for each Agent, Experiment combination, which we'd have to manually filter or average when necessary. The advantage is that we could simply submit things multiple times: each submission gets run, and each result lands in the table.
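The manual averaging step under this strategy would just be a group-by. Again the schema is illustrative, not the real one:

```python
# Sketch of averaging duplicate ResultRecords once the unique
# (AgentId, ExperimentId) index is gone. Schema is an assumption.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table resultRecords (experimentId int, agentId int, score real);
-- The same (agent, experiment) pair recorded three times:
insert into resultRecords values (1, 7, 0.5), (1, 7, 0.75), (1, 7, 1.0);
""")

rows = con.execute(
    "select agentId, experimentId, avg(score) "
    "from resultRecords group by agentId, experimentId"
).fetchall()
print(rows)  # [(7, 1, 0.75)]
```

Workable, but every downstream query has to remember to do this, which is part of why it fits poorly with the accountability idea below.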
Now that I'm thinking more clearly about this, I don't think this strategy fits with the new accountability strategy that I'm trying to follow where we have a record of each submission. I think we should add the runNumber to the param summary.