### Analyze dataset

The easiest way to use _automea_ is to set up a _csv_ file listing the datasets and the corresponding wells we want to analyze. The method _analyze_dataset()_ takes this csv file as input and, based on user-defined parameters, saves outputs such as statistics for each dataset and lists of spikes and bursts for each well.

As an example, we have prepared the file 'filenames_and_wells_2.csv'. Let's load it with _pandas_ to take a look at it.


```python
import pandas as pd
csv_file = '../automea/final/filenames_and_wells_2.csv'
filenames_and_wells = pd.read_csv(csv_file, sep = ';')
filenames_and_wells
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>filename</th>
      <th>wells</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>AH22C001_3039_DIV14.h5</td>
      <td>A6</td>
    </tr>
  </tbody>
</table>
</div>



The file contains one dataset, 'AH22C001_3039_DIV14.h5', and one well (A6).
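
To analyze other recordings, a file with the same layout can be written with _pandas_. The following is a minimal sketch: the column names and the `;` separator are taken from the file shown above, while the output name 'my_filenames_and_wells.csv' is a hypothetical placeholder. How several wells of the same dataset should be listed is not shown here, so the sketch sticks to one well per row.

```python
import pandas as pd

# One row per dataset/well pair, using the same column names as the example file.
datasets = pd.DataFrame({
    'filename': ['AH22C001_3039_DIV14.h5'],
    'wells': ['A6'],
})

# 'my_filenames_and_wells.csv' is a hypothetical name chosen for this sketch.
datasets.to_csv('my_filenames_and_wells.csv', sep=';', index=False)

# Read the file back the same way the example file is loaded above.
check = pd.read_csv('my_filenames_and_wells.csv', sep=';')
print(check)
```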


```python
import os
os.chdir('../automea/final/')
```

To perform the analysis we first need to define a set of parameters. Let's start by importing _automea_ and creating an object from the _Analysis()_ class.


```python
import automea
```


```python
am = automea.Analysis()
```

One of the user-defined parameters is the machine learning model used for burst detection.

If the user wants to use one of the pretrained models available with the package, it is only necessary to set _model_name_. The method _loadmodel()_ can then be called, and the chosen model will be used for the following analyses.


```python
am.model_name = 'signal30.h5'
am.loadmodel()
```

We can check the model parameters used for burst detection by looking at the _model_params_ attribute.


```python
am.model_params
```




    {'name': 'signal30.h5',
     'input_type': 'signal',
     'input_average': 30,
     'window_size': 50000,
     'window_overlap': 25000}
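
One way to read _window_size_ and _window_overlap_ is as a sliding-window scheme over the recorded signal. The sketch below is an illustration only, not _automea_'s internal code: it assumes both values are counted in samples, that _window_overlap_ is the number of samples shared by consecutive windows, and that only complete windows are kept. The helper `window_starts` and the signal length are hypothetical.

```python
def window_starts(n_samples, window_size=50000, window_overlap=25000):
    """Return the start index of every complete window in a signal.

    Assumption: consecutive windows share `window_overlap` samples,
    so each window starts `window_size - window_overlap` samples
    after the previous one.
    """
    step = window_size - window_overlap
    if step <= 0:
        raise ValueError('window_overlap must be smaller than window_size')
    last_start = n_samples - window_size
    if last_start < 0:
        return []  # the signal is shorter than a single window
    return list(range(0, last_start + 1, step))


# Hypothetical signal length, chosen only for illustration.
starts = window_starts(200000)
print(len(starts), starts)
```

Under these assumptions, every sample away from the edges of the signal is covered by two windows.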



With the model loaded, we can define what we want to save from the analysis by changing the _analysis_params_ attribute.


```python
am.analysis_params
```




    {'save_spikes': False,
     'save_reverbs': False,
     'save_bursts': False,
     'save_net_reverbs': False,
     'save_net_bursts': False,
     'save_stats': False}



For instance, if we want to save only the high-level statistics, we set _save_stats_ to _True_:


```python
am.analysis_params['save_stats'] = True
```
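
Several outputs can be switched on in the same way. Because assigning to a misspelled key would silently add a new entry instead of changing an existing flag, it can help to check the key first. The sketch below uses a plain dictionary that mirrors the _analysis_params_ shown above, so it runs without _automea_; with the package loaded, the same loop could be applied to `am.analysis_params`.

```python
# Stand-in for am.analysis_params, copied from the output shown above.
analysis_params = {
    'save_spikes': False,
    'save_reverbs': False,
    'save_bursts': False,
    'save_net_reverbs': False,
    'save_net_bursts': False,
    'save_stats': False,
}

# Enable only flags that already exist, so a typo raises an error.
for key in ('save_stats', 'save_net_bursts'):
    if key not in analysis_params:
        raise KeyError(f'unknown analysis parameter: {key}')
    analysis_params[key] = True

# List the outputs that will be saved.
enabled = [key for key, value in analysis_params.items() if value]
print(enabled)
```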

Now, we need to indicate where to find the csv file containing the datasets we want to analyze, the location of the datasets, and define a name for the output files.


```python
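am.path_to_csv = ''  # the csv file is in the current working directory
csv_file = 'filenames_and_wells_2.csv'
am.path_to_dataset = '../../qneuron/mea-reproduce-results/mea-h5-datafiles/'  # folder with the .h5 datasets
am.output_name = 'test'  # prefix of the output files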
```
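
Since an analysis can run for a long time, it may be worth checking beforehand that every dataset listed in the csv file actually exists in the dataset folder. The helper below is a hypothetical sketch and not part of _automea_; it only assumes the `filename` column and the `;` separator shown earlier.

```python
from pathlib import Path

import pandas as pd


def find_missing_datasets(csv_path, dataset_dir):
    """Return the filenames listed in the csv file that are not in dataset_dir."""
    table = pd.read_csv(csv_path, sep=';')
    dataset_dir = Path(dataset_dir)
    return [
        name
        for name in table['filename'].unique()
        if not (dataset_dir / name).is_file()
    ]
```

With the settings above, calling `find_missing_datasets(csv_file, am.path_to_dataset)` should return an empty list if all datasets are in place.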

Finally, we can run the _analyze_dataset()_ method on the csv file.


```python
am.analyze_dataset(csv_file)
```

    --- Running full analysis ---
    
    --- Using model  signal30.h5  ---
    
    
     Analyzing dataset:  AH22C001_3039_DIV14.h5 
    
    Well:  A6
    
    --- Done! ---


While the code is running, messages show which dataset and well are currently being analyzed.

After some minutes, or hours depending on how many datasets we want to analyze, a *Done!* message is shown, indicating that the process is finished.

The statistics we saved can be found in the file named 'test_STATS_PREDICTED.csv'. We can import the file and look at the results.


```python
stats = pd.read_csv('test_STATS_PREDICTED.csv')
stats
```




<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Filename</th>
      <th>Well Label</th>
      <th>Number of channels</th>
      <th>Total number of spikes</th>
      <th>Mean Firing Rate [Hz]</th>
      <th>Stray spikes (%)</th>
      <th>Total number of networks bursts</th>
      <th>Mean Network Bursting Rate [bursts/minute]</th>
      <th>Mean Network Burst Duration [ms]</th>
      <th>NIBI</th>
      <th>CV of NIBI</th>
      <th>Mean reverb per burst</th>
      <th>Median of reverb per burst</th>
      <th>Mean net reverb per net burst</th>
      <th>Median of net reverb per net burst</th>
      <th>Total number of network reverb</th>
      <th>Mean net reverb frequency [reverb/min]</th>
      <th>Mean net reverb duration [ms]</th>
      <th>Mean in-reverb freq [Hz]</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>AH22C001_3039_DIV14.h5</td>
      <td>A6</td>
      <td>12</td>
      <td>56193</td>
      <td>7.8</td>
      <td>11.92</td>
      <td>49</td>
      <td>4.9</td>
      <td>759.41</td>
      <td>11199.72</td>
      <td>0.27</td>
      <td>4.84</td>
      <td>5.0</td>
      <td>4.73</td>
      <td>5.0</td>
      <td>237</td>
      <td>23.7</td>
      <td>91.77</td>
      <td>189.31</td>
    </tr>
  </tbody>
</table>
</div>



The file contains statistics about the dataset/well analyzed.
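
As a quick sanity check, the reported rates can be related to the reported counts. The sketch below uses the values from the table above and rests on two assumptions that are not confirmed here: that each rate is a count divided by the recording duration, and that the mean firing rate is averaged over channels. The recording duration itself is not given and is only inferred.

```python
# Values copied from the statistics table above.
n_spikes = 56193
n_channels = 12
n_net_bursts = 49
net_burst_rate_per_min = 4.9
n_net_reverbs = 237
net_reverb_rate_per_min = 23.7

# Assumption: rate = count / duration, so duration = count / rate.
duration_from_bursts = n_net_bursts / net_burst_rate_per_min      # minutes
duration_from_reverbs = n_net_reverbs / net_reverb_rate_per_min   # minutes

# Assumption: the mean firing rate is spikes per channel per second.
mean_firing_rate = n_spikes / n_channels / (duration_from_bursts * 60)

print(round(duration_from_bursts, 2), round(duration_from_reverbs, 2))
print(round(mean_firing_rate, 1))
```

Both duration estimates agree, and the resulting firing rate rounds to the 7.8 Hz reported in the table, which suggests that the assumptions are at least consistent with the output.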

