read_datasets
- moocore.read_datasets(filename)
- Read an input dataset file and return its contents as a NumPy array.
- Parameters:
- filename (str | PathLike | StringIO) – Filename of the dataset file, or an io.StringIO object that directly contains the file contents. If the filename is not an absolute path, it is interpreted relative to the current working directory. If the filename has the extension .xz, the file is decompressed to a temporary file before it is read. Each line of the file corresponds to one point of one dataset. Different datasets are separated by an empty line.
- Returns:
- ndarray – An array representing the data in the file. Of its \(n\) columns, the first \(n-1\) contain the numerical values of the objectives. The last column contains the number of the set to which each point belongs.
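To make the file layout and the returned column layout concrete, here is a minimal pure-Python sketch of the format described above. It is not moocore's implementation: `parse_datasets_sketch` is a hypothetical helper, and it assumes whitespace-separated numeric fields, treats several consecutive empty lines as a single separator, and ignores .xz decompression and error handling.

```python
from io import StringIO


def parse_datasets_sketch(text):
    """Illustrative parser for the blank-line-separated format.

    Hypothetical helper, NOT moocore's code. Returns a list of rows,
    each row being the objective values followed by the set number.
    """
    rows = []
    set_id = 0
    in_set = False  # True while we are inside a run of non-empty lines
    for line in StringIO(text):
        fields = line.split()
        if not fields:
            in_set = False  # an empty line ends the current set
            continue
        if not in_set:
            set_id += 1  # first point of a new set
            in_set = True
        # Objective values first, set number as the last column.
        rows.append([float(x) for x in fields] + [float(set_id)])
    return rows


rows = parse_datasets_sketch("0.5 0.5\n\n1 0\n0 1\n\n0.5 0.5")
for row in rows:
    print(row)
```

Each printed row ends with the number of the set it came from, mirroring the last column of the array that read_datasets returns. Sets are numbered from 1 in the order they appear.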
- Examples

  >>> import moocore
  >>> filename = moocore.get_dataset_path("input1.dat")
  >>> moocore.read_datasets(filename)
  array([[ 8.07559653,  2.40702554,  1.        ],
         [ 8.66094446,  3.64050144,  1.        ],
         [ 0.20816431,  4.62275469,  1.        ],
         ...
         [ 4.92599726,  2.70492519, 10.        ],
         [ 1.22234394,  5.68950311, 10.        ],
         [ 7.99466959,  2.81122537, 10.        ],
         [ 2.12700289,  2.43114174, 10.        ]])

  The numpy array represents this data:

  Objective 1    Objective 2    Set Number
  8.07559653     2.40702554     1
  8.66094446     3.64050144     1
  …              …              …
  7.99466959     2.81122537     10
  2.12700289     2.43114174     10

  It is also possible to read datasets from a string:

  >>> from io import StringIO
  >>> fh = StringIO("0.5 0.5\n\n1 0\n0 1\n\n0.5 0.5")
  >>> moocore.read_datasets(fh)
  array([[0.5, 0.5, 1. ],
         [1. , 0. , 2. ],
         [0. , 1. , 2. ],
         [0.5, 0.5, 3. ]])
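Because the set number is stored in the last column, a common follow-up step is to split the returned array back into one array per set. The sketch below shows one way to do this with plain NumPy boolean indexing. `split_by_set` is a hypothetical helper, not part of moocore, and the input array is copied by hand from the StringIO example above so that the snippet runs without the library.

```python
import numpy as np

# Array copied from the StringIO example: two objective columns
# followed by the set-number column.
data = np.array(
    [
        [0.5, 0.5, 1.0],
        [1.0, 0.0, 2.0],
        [0.0, 1.0, 2.0],
        [0.5, 0.5, 3.0],
    ]
)


def split_by_set(data):
    """Return a dict mapping set number -> array of objective values.

    Hypothetical helper for illustration; assumes the last column
    holds the set number, as described under Returns.
    """
    set_ids = data[:, -1].astype(int)
    # Select the rows of each set and drop the set-number column.
    return {int(k): data[set_ids == k, :-1] for k in np.unique(set_ids)}


sets = split_by_set(data)
for k, points in sets.items():
    print(k, points.tolist())
```

Here set 2 holds the two points (1, 0) and (0, 1), while sets 1 and 3 each hold the single point (0.5, 0.5).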