:py:mod:`rhino_health.lib.endpoints.dataset.dataset_dataclass`
==============================================================

.. py:module:: rhino_health.lib.endpoints.dataset.dataset_dataclass


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   rhino_health.lib.endpoints.dataset.dataset_dataclass.DatasetCreateInput
   rhino_health.lib.endpoints.dataset.dataset_dataclass.Dataset


.. py:class:: DatasetCreateInput(**data)

   Bases: :py:obj:`BaseDataset`

   Input arguments for adding a new Dataset.

   .. py:attribute:: csv_filesystem_location
      :type: Optional[str]

      The on-prem location of the Dataset data. The file should be a CSV.

   .. py:attribute:: method
      :type: typing_extensions.Literal['DICOM', 'filesystem']

      The source to import imaging data from: either a DICOM server or the
      local file system.

   .. py:attribute:: is_data_deidentified
      :type: Optional[bool]
      :value: False

      Whether the data is already de-identified.

   .. py:attribute:: image_dicom_server
      :type: Optional[str]

      The DICOM server URL to import DICOM images from.

   .. py:attribute:: image_filesystem_location
      :type: Optional[str]

      The on-prem location to import DICOM images from.

   .. py:attribute:: file_base_path
      :type: Optional[str]

      The on-prem location of non-DICOM files listed in the Dataset data CSV.

   .. py:attribute:: sync
      :type: Optional[bool]
      :value: True

      Whether to perform this import request synchronously.

   .. py:attribute:: name
      :type: str

      The name of the Dataset.

   .. py:attribute:: description
      :type: str

      The description of the Dataset.

   .. py:attribute:: base_version_uid
      :type: Optional[str]

      The original Dataset this Dataset is a new version of, if applicable.

   .. py:attribute:: project_uid
      :type: typing_extensions.Annotated[str, Field(alias='project')]

      The unique ID of the Project this Dataset belongs to.

   .. py:attribute:: workgroup_uid
      :type: typing_extensions.Annotated[str, Field(alias='workgroup')]

      The unique ID of the Workgroup this Dataset belongs to.

      .. warning:: workgroup_uid may change to primary_workgroup_uid in the future.

   .. py:attribute:: data_schema_uid
      :type: typing_extensions.Annotated[Any, Field(alias='data_schema')]

      The unique ID of the DataSchema this Dataset follows.

   .. py:method:: import_args()


.. py:class:: Dataset(**data)

   .. py:property:: dataset_info

      Sanitized metadata information about the Dataset.

   .. py:property:: data_schema
      :type: DataSchema

      Return the DataSchema dataclass associated with data_schema_uid.

      .. warning:: The result of this function is cached. Be careful calling
         this function after making changes. All dataclasses must already
         exist on the platform before making this call.

      :Returns:

         **data_schema** : DataSchema
            Dataclass representing the DataSchema

   .. py:property:: project
      :type: Project

      Return the Project dataclass associated with project_uid.

      .. warning:: The result of this function is cached. Be careful calling
         this function after making changes. All dataclasses must already
         exist on the platform before making this call.

      :Returns:

         **project** : Project
            Dataclass representing the Project

   .. py:property:: workgroup
      :type: Workgroup

      Return the Workgroup dataclass associated with workgroup_uid.

      .. warning:: The result of this function is cached. Be careful calling
         this function after making changes. All dataclasses must already
         exist on the platform before making this call.

      :Returns:

         **workgroup** : Workgroup
            Dataclass representing the Workgroup

   .. py:property:: creator
      :type: User

      Return the User dataclass associated with creator_uid.

      .. warning:: The result of this function is cached. Be careful calling
         this function after making changes. All dataclasses must already
         exist on the platform before making this call.

      :Returns:

         **creator** : User
            Dataclass representing the User

   .. py:attribute:: data_schema_uid
      :type: str

   .. py:attribute:: uid
      :type: str

      The unique ID of the Dataset.

   .. py:attribute:: version
      :type: Optional[int]
      :value: 0

      Which revision this Dataset is.

   .. py:attribute:: num_cases
      :type: int

      The number of cases in the Dataset.

   .. py:attribute:: import_status
      :type: str

      The import status of the Dataset.

   .. py:attribute:: name
      :type: str

      The name of the Dataset.

   .. py:attribute:: description
      :type: str

      The description of the Dataset.

   .. py:attribute:: base_version_uid
      :type: Optional[str]

      The original Dataset this Dataset is a new version of, if applicable.

   .. py:attribute:: creator_uid
      :type: str

      The UID of the creator of this dataclass on the system.

   .. py:attribute:: created_at
      :type: str

      When this dataclass was created on the system.

   .. py:attribute:: data_schema_name
      :type: str

      The data schema name.

   .. py:attribute:: project_name
      :type: str

      The project name.

   .. py:attribute:: workgroup_name
      :type: str

      The workgroup name.

   .. py:attribute:: creator_name
      :type: str

      The creator name.

   .. py:method:: get_metric(metric_configuration: rhino_health.lib.metrics.base_metric.BaseMetric)

      Queries on-prem and returns the result based on metric_configuration
      for this Dataset.

      .. seealso::

         :obj:`rhino_health.lib.endpoints.dataset.dataset_endpoints.DatasetEndpoints.get_dataset_metric`
            Full documentation

   .. py:method:: run_code(run_code, print_progress=True, **kwargs)

      Create and run code on this Dataset, using defaults that can be overridden.

      .. warning:: This function relies on a Dataset's metadata, so make sure
         to create the input Dataset first.

      .. warning:: This feature is under development and the interface may change.

      :Parameters:

         **run_code** : str
            The code that will run in the container

         **print_progress** : bool = True
            Whether to print how long has elapsed since the start of the wait

         **name** : Optional[str] = "{dataset.name} (v.{dataset.version}) containerless code"
            Model name. Uses the dataset name and version as part of the
            default (e.g. for the first version of a dataset named
            dataset_one, the name will be "dataset_one (v.1) containerless code")

         **description** : Optional[str] = "Python code run"
            Model description

         **container_image_uri** : Optional[str] = "{ENV_URL}/rhino-gc-workgroup-rhino-health:generic-python-runner"
            URI of the container that should be run. ENV_URL is the
            environment's ECR repository URL.

         **input_data_schema_uid** : Optional[str] = dataset.data_schema_uid
            The data schema used for the input dataset. By default, uses the
            data schema used to import the dataset.

         **output_data_schema_uid** : Optional[str] = None (auto-generate data schema)
            The data schema used for the output dataset. By default, generates
            a schema from the dataset CSV.

         **output_dataset_names_suffix** : Optional[str] = "containerless code"
            String that will be appended to each output dataset name

         **timeout_seconds** : Optional[int] = 600
            Amount of time before timeout, in seconds

      :Returns:

         Tuple: (output_datasets, code_run)

         **output_datasets** : List[Dataset]
            List of Dataset dataclasses

         **code_run** : CodeRun
            A CodeRun object containing the run outcome

      .. rubric:: Examples

      >>> dataset.run_code(run_code="print('hello from the container')")
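

.. rubric:: Usage sketch: creating a Dataset

A minimal sketch of building a ``DatasetCreateInput`` from the fields documented
above. The login helper and the ``session.dataset.add_dataset`` endpoint name are
assumptions about the surrounding SDK, not confirmed by this module's source;
check the SDK documentation for your version. All UID placeholder values are
hypothetical.

.. code-block:: python

   import rhino_health
   from rhino_health.lib.endpoints.dataset.dataset_dataclass import DatasetCreateInput

   # Assumed login helper; consult the SDK docs for your version.
   session = rhino_health.login(username="user@example.com", password="...")

   dataset_input = DatasetCreateInput(
       name="My Dataset",
       description="Example filesystem import",
       project_uid="<project-uid>",        # serialized under the alias "project"
       workgroup_uid="<workgroup-uid>",    # serialized under the alias "workgroup"
       data_schema_uid="<data-schema-uid>",
       csv_filesystem_location="/rhino_data/dataset.csv",
       method="filesystem",                # or "DICOM" with image_dicom_server set
       is_data_deidentified=True,
       sync=True,
   )

   # Hypothetical endpoint name; the add-dataset call may differ by SDK version.
   dataset = session.dataset.add_dataset(dataset_input)

Note that ``project_uid``, ``workgroup_uid``, and ``data_schema_uid`` are declared
with pydantic-style ``Field(alias=...)`` annotations, so they are sent to the API
under the shorter alias names.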
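
.. rubric:: Usage sketch: run_code

A sketch of calling ``run_code`` on an existing, already-imported Dataset with a
few of the documented defaults overridden. The code string and the override
values are illustrative only.

.. code-block:: python

   # Assumes `dataset` is an existing Dataset dataclass whose import has completed.
   output_datasets, code_run = dataset.run_code(
       run_code="print('hello from the container')",  # illustrative code string
       print_progress=True,                           # report elapsed wait time
       output_dataset_names_suffix="containerless code",
       timeout_seconds=600,
   )

   # Per the documented return value: a tuple of the output Dataset
   # dataclasses and a CodeRun object describing the run outcome.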