Auto Image Classification
In [ ]:
import os
import sys
import warnings
from pathlib import Path
warnings.filterwarnings("ignore")
os.chdir("../../")
In [ ]:
import ray
from flash.core.data.utils import download_data
from gradsflow import AutoImageClassifier
from gradsflow.data.image import image_dataset_from_directory
Let's use the Hymenoptera
dataset provided by Flash, which contains images of ants and bees, to build an image classification model.
In [ ]:
data_dir = "/Users/aniket/personal/gradsflow/gradsflow/data/" # replace with your filepath
# download_data("https://pl-flash-data.s3.amazonaws.com/hymenoptera_data.zip", data_dir)
In [ ]:
train_data = image_dataset_from_directory(f"{data_dir}/hymenoptera_data/train/", transform=True)
train_dl = train_data.dataloader
val_data = image_dataset_from_directory(f"{data_dir}/hymenoptera_data/val/", transform=True)
val_dl = val_data.dataloader
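`image_dataset_from_directory` follows the usual ImageFolder convention: one sub-folder per split, one sub-folder per class inside it, with the folder name serving as the label. A minimal sketch of the layout this notebook assumes (the temporary paths and file names below are illustrative, not part of Gradsflow):

```python
from pathlib import Path
import tempfile

# Build a toy ImageFolder-style tree: <root>/<split>/<class>/<image files>.
root = Path(tempfile.mkdtemp()) / "hymenoptera_data"
for split in ("train", "val"):
    for label in ("ants", "bees"):
        d = root / split / label
        d.mkdir(parents=True)
        (d / "example.jpg").touch()  # stand-in for a real image file

# Class names are recovered from the folder names.
classes = sorted(p.name for p in (root / "train").iterdir())
print(classes)  # ['ants', 'bees']
```

With this layout in place, pointing `image_dataset_from_directory` at `{root}/train/` yields a dataset with two classes.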
If you want to run Gradsflow on a remote server, first set up a Ray cluster and then initialize Ray with the remote address.
In [ ]:
# ray.init(address="REMOTE_IP_ADDR")
# ray.init(local_mode=True)
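For reference, a hedged sketch of common `ray.init` patterns (the head-node address below is a placeholder, not a real endpoint):

```python
import ray

# Connect to a remote cluster via the Ray client; replace the placeholder host.
# ray.init(address="ray://<head-node-ip>:10001")

# Attach to a cluster already started on this machine with `ray start --head`:
# ray.init(address="auto")

# Run everything in a single local process -- handy for debugging:
# ray.init(local_mode=True)
```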
To train an image classifier, create an AutoImageClassifier
object and provide the number of trials and a timeout (in seconds).
In [ ]:
model = AutoImageClassifier(
train_dataloader=train_dl,
val_dataloader=val_dl,
num_classes=2,
n_trials=1,
optimization_metric="train_loss",
timeout=50,
)
print("AutoImageClassifier initialised!")
AutoImageClassifier initialised!
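Under the hood, `hp_tune` asks Ray Tune to sample configurations (backbone, learning rate, optimizer), train each one, and keep the best according to `optimization_metric`, stopping after `n_trials` trials or `timeout` seconds, whichever comes first. A rough pure-Python illustration of that contract (a toy sketch, not Gradsflow's actual implementation; the `score` function stands in for a full training run):

```python
import random
import time

def score(config):
    # Stand-in for one training run; lower is better, like train_loss.
    return abs(config["lr"] - 1e-3) + (0.1 if config["optimizer"] == "sgd" else 0.0)

def hp_tune(n_trials, timeout, seed=0):
    rng = random.Random(seed)
    deadline = time.monotonic() + timeout
    best = None
    for _ in range(n_trials):
        if time.monotonic() > deadline:  # budget exhausted, like Ray Tune's timeout stopper
            break
        config = {
            "lr": 10 ** rng.uniform(-4, -1),        # log-uniform learning rate
            "optimizer": rng.choice(["adam", "sgd"]),
        }
        trial = (score(config), config)
        if best is None or trial[0] < best[0]:      # keep the lowest score seen so far
            best = trial
    return best

best = hp_tune(n_trials=5, timeout=50)
print(best)
```

Ray Tune additionally schedules trials in parallel across the cluster and logs each one to a result directory, but the stopping and selection logic follows this shape.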
In [ ]:
analysis = model.hp_tune()
print("completed!")
2021-10-08 11:48:37,675 INFO services.py:1263 -- View the Ray dashboard at http://127.0.0.1:8265
2021-10-08 11:48:40,110 WARNING function_runner.py:558 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.
== Status ==
Memory usage on this node: 10.4/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/4.54 GiB heap, 0.0/2.27 GiB objects
Result logdir: /Users/aniket/ray_results/optimization_objective_2021-10-08_11-48-40
Number of trials: 1/1 (1 PENDING)
| Trial name | status | loc | backbone | lr | optimizer |
|---|---|---|---|---|---|
| optimization_objective_94628_00000 | PENDING | | ssl_resnet50 | 0.00387828 | sgd |
2021-10-08 11:52:31,654 WARNING tune.py:518 -- SIGINT received (e.g. via Ctrl+C), ending Ray Tune run. This will try to checkpoint the experiment state one last time. Press CTRL+C one more time (or send SIGINT/SIGKILL/SIGTERM) to skip.
2021-10-08 11:52:31,665 ERROR trial_runner.py:773 -- Trial optimization_objective_94628_00000: Error processing event.
Traceback (most recent call last):
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::ImplicitFunc.train_buffered() (pid=80657, ip=192.168.40.84, repr=<ray.tune.function_runner.ImplicitFunc object at 0x7fbb0274f4f0>)
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/tune/trainable.py", line 178, in train_buffered
    result = self.train()
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/tune/trainable.py", line 237, in train
    result = self.step()
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/site-packages/ray/tune/function_runner.py", line 361, in step
    result = self._results_queue.get(
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/Users/aniket/miniconda3/envs/am/lib/python3.8/threading.py", line 306, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

ray::ImplicitFunc.train_buffered() (pid=80657, ip=192.168.40.84, repr=<ray.tune.function_runner.ImplicitFunc object at 0x7fbb0274f4f0>)
ray.exceptions.TaskCancelledError: Task: TaskID(69a6825d641b46131cef336a2abb8d3ef89d181901000000) was cancelled
2021-10-08 11:52:31,685 INFO stopper.py:348 -- Reached timeout of 50 seconds. Stopping all trials.
Result for optimization_objective_94628_00000: {}
== Status ==
Memory usage on this node: 9.3/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/4.54 GiB heap, 0.0/2.27 GiB objects
Result logdir: /Users/aniket/ray_results/optimization_objective_2021-10-08_11-48-40
Number of trials: 1/1 (1 ERROR)
| Trial name | status | loc | backbone | lr | optimizer |
|---|---|---|---|---|---|
| optimization_objective_94628_00000 | ERROR | | ssl_resnet50 | 0.00387828 | sgd |

Number of errored trials: 1

| Trial name | # failures | error file |
|---|---|---|
| optimization_objective_94628_00000 | 1 | /Users/aniket/ray_results/optimization_objective_2021-10-08_11-48-40/optimization_objective_94628_00000_0_backbone=ssl_resnet50,lr=0.0038783,optimizer=sgd_2021-10-08_11-48-40/error.txt |
== Status ==
Memory usage on this node: 9.2/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/4.54 GiB heap, 0.0/2.27 GiB objects
Result logdir: /Users/aniket/ray_results/optimization_objective_2021-10-08_11-48-40
Number of trials: 1/1 (1 ERROR)
| Trial name | status | loc | backbone | lr | optimizer |
|---|---|---|---|---|---|
| optimization_objective_94628_00000 | ERROR | | ssl_resnet50 | 0.00387828 | sgd |

Number of errored trials: 1

| Trial name | # failures | error file |
|---|---|---|
| optimization_objective_94628_00000 | 1 | /Users/aniket/ray_results/optimization_objective_2021-10-08_11-48-40/optimization_objective_94628_00000_0_backbone=ssl_resnet50,lr=0.0038783,optimizer=sgd_2021-10-08_11-48-40/error.txt |
2021-10-08 11:52:31,819 ERROR tune.py:557 -- Trials did not complete: [optimization_objective_94628_00000]
2021-10-08 11:52:31,822 INFO tune.py:561 -- Total run time: 231.72 seconds (231.52 seconds for the tuning loop).
2021-10-08 11:52:31,824 WARNING tune.py:565 -- Experiment has been interrupted, but the most recent state was saved. You can continue running this experiment by passing `resume=True` to `tune.run()`
2021-10-08 11:52:31,837 WARNING experiment_analysis.py:644 -- Could not find best trial. Did you pass the correct `metric` parameter?
completed!
In [ ]:
# ray.shutdown()
(pid=80657) /Users/aniket/miniconda3/envs/am/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 32 leaked semaphore objects to clean up at shutdown
(pid=80657)   warnings.warn('resource_tracker: There appear to be %d '
Last update:
October 8, 2021