Minor changes for Open Sourcing.
PiperOrigin-RevId: 568780473
The human_scene_transformer Authors committed Sep 28, 2023
1 parent 1ce78e5 commit aa7cc31
Showing 13 changed files with 324 additions and 25 deletions.
60 changes: 54 additions & 6 deletions README.md
@@ -1,6 +1,6 @@
# Human Scene Transformer

![Human Scene Transformer](./images/hero.png)
![Human Scene Transformer](./human_scene_transformer/images/hero.png)

Anticipating the motion of all humans in dynamic environments such as homes and offices is critical to enable safe and effective robot navigation. Such spaces remain challenging as humans do not follow strict rules of motion and there are often multiple occluded entry points such as corners and doors that create opportunities for sudden encounters. In this work, we present a Transformer based architecture to predict human future trajectories in human-centric environments from input features including human positions, head orientations, and 3D skeletal keypoints from onboard in-the-wild sensory information. The resulting model captures the inherent uncertainty for future human trajectory prediction and achieves state-of-the-art performance on common prediction benchmarks and a human tracking dataset captured from a mobile robot adapted for the prediction task. Furthermore, we identify new agents with limited historical data as a major contributor to error and demonstrate the complementary nature of 3D skeletal poses in reducing prediction error in such challenging scenarios.

@@ -24,6 +24,17 @@ If you use this work please cite our paper
## Data

### JRDB

We provide an extensive pre-processing pipeline to convert the JRDB dataset:
JRDB was created as a detection and tracking dataset rather than a prediction
dataset. To make the data suitable for a prediction task, we first extract the
robot motion from the raw sensor data so that the robot's own motion can be
accounted for. Further, on the JRDB training split we combine algorithmic
detections with the ground truth labels from the tracking dataset to create
authentic tracks as input and labels for HST.
Note that we do not purely use the hand-labeled ground truth tracks in the JRDB
train split, as we find them to be overly smoothed, which gives away the future
human movement.
To adapt the JRDB dataset for prediction please follow [this](/data) README.
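
For orientation, this commit adds `data/jrdb_preprocess_test.py` for the test
split; a minimal sketch of invoking it from Python after following the steps in
the data README (the two paths are placeholders for your local JRDB download
and output directory):

```python
# Sketch: convert the JRDB test split into the prediction format used by HST.
# This mirrors what the script's __main__ block does; adjust both paths first.
from human_scene_transformer.data import jrdb_preprocess_test

jrdb_preprocess_test.jrdb_preprocess_test(
    input_path='/path/to/jrdb',                      # placeholder dataset root
    output_path='/path/to/jrdb/processed_prediction',  # placeholder output dir
)
```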

Make sure to adapt `<data_path>` in `config/<jrdb/pedestrians>/dataset_params.gin` accordingly.
@@ -38,17 +49,14 @@ Please download the raw data [here](https://github.com/StanfordASL/Trajectron-pl

### JRDB
```
python train.py --model_base_dir=./model/jrdb --gin_files=.config/jrdb/training_params.gin --gin_files=.config/jrdb/model_params.gin --gin_files=.config/jrdb/dataset_params.gin --gin_files=.config/jrdb/metrics.gin --dataset=JRDB
python train.py --model_base_dir=./model/jrdb --gin_files=./config/jrdb/training_params.gin --gin_files=./config/jrdb/model_params.gin --gin_files=./config/jrdb/dataset_params.gin --gin_files=./config/jrdb/metrics.gin --dataset=JRDB
```

### Pedestrians ETH/UCY
```
python train.py --model_base_dir=./models/pedestrians_eth --gin_files=.config/pedestrians/training_params.gin --gin_files=.config/pedestrians/model_params.gin --gin_files=.config/pedestrians/dataset_params.gin --gin_files=.config/pedestrians/metrics.gin --dataset=PEDESTRIANS
python train.py --model_base_dir=./models/pedestrians_eth --gin_files=./config/pedestrians/training_params.gin --gin_files=./config/pedestrians/model_params.gin --gin_files=./config/pedestrians/dataset_params.gin --gin_files=./config/pedestrians/metrics.gin --dataset=PEDESTRIANS
```

## Checkpoints
Coming soon!

---

## Evaluation
@@ -58,7 +66,47 @@ Coming soon!
python jrdb/eval.py --model_path=./models/jrdb/ --checkpoint_path=./models/jrdb/ckpts/ckpt-30
```

#### Keypoints Impact Evaluation
```
python jrdb/eval_keypoints.py --model_path=./models/jrdb/ --checkpoint_path=./models/jrdb/ckpts/ckpt-30
```

vs

```
python jrdb/eval_keypoints.py --model_path=./models/jrdb_no_keypoints/ --checkpoint_path=./models/jrdb_no_keypoints/ckpts/ckpt-30
```

### Pedestrians ETH/UCY
```
python pedestrians/eval.py --model_path=./models/pedestrians_eth/ --checkpoint_path=./models/pedestrians_eth/ckpts/ckpt-20
```

---

## Results

Compared to the published paper, we improved our data processing and fixed
small bugs in this code release. If you compare against our method, please use
the following updated results.

On the JRDB dataset with dataset options as set [here](/config/jrdb/dataset_params.gin):

|        | AVG   | @ 1s  | @ 2s  | @ 3s  | @ 4s  |
|--------|-------|-------|-------|-------|-------|
| MinADE | 0.26  | 0.12  | 0.20  | 0.28  | 0.37  |
| MinFDE | 0.45  | 0.21  | 0.39  | 0.56  | 0.71  |
| NLL    | -0.59 | -0.90 | -0.65 | -0.08 | 0.32  |

On the ETH/UCY Pedestrians Dataset:

| | ETH | Hotel | Univ | Zara1 | Zara2 | Avg |
|--------|------|-------|------|-------|-------|-------|
| MinADE | 0.41 | 0.10 | 0.24 | 0.17 | 0.14 | 0.21 |
| MinFDE | 0.73 | 0.14 | 0.44 | 0.30 | 0.24 | 0.37 |
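
For reference, MinADE/MinFDE measure the displacement of the best of the
predicted future modes against the ground truth. A minimal NumPy sketch of the
standard definition (not the exact evaluation code in this repository):

```python
import numpy as np


def min_ade_fde(pred_modes, gt_future):
  """Standard MinADE / MinFDE over a set of predicted trajectory modes.

  pred_modes: [num_modes, num_future_steps, 2] predicted x/y positions.
  gt_future:  [num_future_steps, 2] ground-truth x/y positions.
  """
  # Euclidean displacement of every mode at every future timestep.
  dist = np.linalg.norm(pred_modes - gt_future[np.newaxis], axis=-1)
  min_ade = dist.mean(axis=-1).min()  # best mode by average displacement
  min_fde = dist[:, -1].min()         # best mode by final displacement
  return min_ade, min_fde
```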


### Checkpoints
You can download trained model checkpoints for both the `JRDB` and `Pedestrians (ETH/UCY)` datasets [here]() (Coming Soon).

To evaluate the pre-trained checkpoints, you will have to adjust the path to the dataset in the respective `params/operative_config.gin` file.
2 changes: 1 addition & 1 deletion human_scene_transformer/config/jrdb/dataset_params.gin
@@ -55,7 +55,7 @@ TEST_SCENES = ['clark-center-2019-02-28_1',
'tressider-2019-04-26_3_test']


JRDBDatasetParams.path = <dataset_path>
JRDBDatasetParams.path = '<dataset_path>'

JRDBDatasetParams.train_scenes = %TRAIN_SCENES
JRDBDatasetParams.eval_scenes = %TEST_SCENES
2 changes: 1 addition & 1 deletion human_scene_transformer/config/pedestrians/dataset_params.gin
@@ -1,4 +1,4 @@
PedestriansDatasetParams.path = <dataset_path>
PedestriansDatasetParams.path = '<dataset_path>'
PedestriansDatasetParams.dataset = 'eth'
PedestriansDatasetParams.train_config = 'train' # train, trainval
PedestriansDatasetParams.eval_config = 'val' # val, test
2 changes: 1 addition & 1 deletion human_scene_transformer/data/README.md
@@ -9,7 +9,7 @@
5. Download and extract `Train Detections` from the JRDB 2019 section to `<data_path>/detections`.

## Get the Leaderboard Test Set Tracks
3. Download and extract the best leaderboard [3D tracking result](https://jrdb.erc.monash.edu/leaderboards/download/1680) to `<data_path>/test_dataset/labels/raw_leaderboard/`.
Download and extract this leaderboard [3D tracking result](https://jrdb.erc.monash.edu/leaderboards/download/1605) to `<data_path>/test_dataset/labels/raw_leaderboard/`, such that you end up with `<data_path>/test_dataset/labels/raw_leaderboard/00XX.txt`. This was the best available leaderboard tracking result at the time the code was developed.

## Get the Robot Odometry Preprocessed Keypoints

235 changes: 235 additions & 0 deletions human_scene_transformer/data/jrdb_preprocess_test.py
@@ -0,0 +1,235 @@
# Copyright 2023 The human_scene_transformer Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Preprocesses the raw test split of JRDB.
"""

import os

from human_scene_transformer.data import utils
import numpy as np
import pandas as pd
import tensorflow as tf
import tqdm

INPUT_PATH = '<dataset_path>'
OUTPUT_PATH = '<output_path>'

POINTCLOUD = True
AGENT_KEYPOINTS = True
FROM_DETECTIONS = True


def list_test_scenes(input_path):
scenes = os.listdir(os.path.join(input_path, 'images', 'image_0'))
scenes.sort()
return scenes


def get_agents_features_df_with_box(
input_path, scene_id, max_distance_to_robot=10.0
):
"""Returns agents features with bounding box from raw leaderboard data."""
jrdb_header = [
'frame',
'track id',
'type',
'truncated',
'occluded',
'alpha',
'bb_left',
'bb_top',
'bb_width',
'bb_height',
'x',
'y',
'z',
'height',
'width',
'length',
'rotation_y',
'score',
]
scene_data_file = utils.get_file_handle(
os.path.join(
input_path, 'labels', 'raw_leaderboard', f'{scene_id:04}' + '.txt'
)
)
df = pd.read_csv(scene_data_file, sep=' ', names=jrdb_header)

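  # Convert points from the camera frame to the lower velodyne (LiDAR) frame:
  # an axis permutation plus a fixed vertical offset between the two sensors.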
def camera_to_lower_velodyne(p):
return np.stack(
[p[..., 2], -p[..., 0], -p[..., 1] + (0.742092 - 0.606982)], axis=-1
)

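  # Keep only detections with a minimal confidence score.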
df = df[df['score'] >= 0.01]

df['p'] = df[['x', 'y', 'z']].apply(
lambda s: camera_to_lower_velodyne(s.to_numpy()), axis=1
)
df['distance'] = df['p'].apply(lambda s: np.linalg.norm(s, axis=-1))
df['l'] = df['height']
df['h'] = df['width']
df['w'] = df['length']
df['yaw'] = df['rotation_y']

df['id'] = df['track id'].apply(lambda s: f'pedestrian:{s}')
df['timestep'] = df['frame']

df = df.set_index(['timestep', 'id'])

df = df[df['distance'] <= max_distance_to_robot]

return df[['p', 'yaw', 'l', 'h', 'w']]


def jrdb_preprocess_test(input_path, output_path):
scenes = list_test_scenes(os.path.join(input_path, 'test_dataset'))
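  # subsample = 1 keeps every timestep; larger values keep every n-th frame.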
subsample = 1
for scene in tqdm.tqdm(scenes):
scene_save_name = scene + '_test'
agents_df = get_agents_features_df_with_box(
os.path.join(input_path, 'test_dataset'),
scenes.index(scene),
max_distance_to_robot=15.0,
)

robot_odom = utils.get_robot(
os.path.join(input_path, 'processed', 'odometry_test'), scene
)

if AGENT_KEYPOINTS:
keypoints = utils.get_agents_keypoints(
os.path.join(
input_path, 'processed', 'labels', 'labels_3d_keypoints_test'
),
scene,
)
keypoints_df = pd.DataFrame.from_dict(
keypoints, orient='index'
).rename_axis(['timestep', 'id']) # pytype: disable=missing-parameter # pandas-drop-duplicates-overloads

agents_df = agents_df.join(keypoints_df)
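      # Agents without detected keypoints get a (33, 3) array of NaNs so that
      # every row has a consistent keypoint shape downstream.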
agents_df.keypoints.fillna(
dict(
zip(
agents_df.index[agents_df['keypoints'].isnull()],
[np.ones((33, 3)) * np.nan]
* len(
agents_df.loc[
agents_df['keypoints'].isnull(), 'keypoints'
]
),
)
),
inplace=True,
)

robot_df = pd.DataFrame.from_dict(robot_odom, orient='index').rename_axis( # pytype: disable=missing-parameter # pandas-drop-duplicates-overloads
['timestep']
)
    # Remove odometry datapoints for timesteps that have no agent data
robot_df = robot_df.iloc[agents_df.index.levels[0]]

assert (agents_df.index.levels[0] == robot_df.index).all()

# Subsample
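    # Timesteps must be contiguous and zero-based so that positional (iloc)
    # subsampling below stays aligned with the timestep index.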
assert len(agents_df.index.levels[0]) == agents_df.index.levels[0].max() + 1
agents_df_subsampled_index = agents_df.unstack('id').iloc[::subsample].index
agents_df = (
agents_df.unstack('id')
.iloc[::subsample]
.reset_index(drop=True)
.stack('id', dropna=True)
)

agents_in_odometry_df = utils.agents_to_odometry_frame(
agents_df, robot_df.iloc[::subsample].reset_index(drop=True)
)

agents_pos_ragged_tensor = utils.agents_pos_to_ragged_tensor(
agents_in_odometry_df
)
agents_yaw_ragged_tensor = utils.agents_yaw_to_ragged_tensor(
agents_in_odometry_df
)
assert (
agents_pos_ragged_tensor.shape[0] == agents_yaw_ragged_tensor.shape[0]
)

tf.data.Dataset.from_tensors(agents_pos_ragged_tensor).save(
os.path.join(output_path, scene_save_name, 'agents', 'position')
)
tf.data.Dataset.from_tensors(agents_yaw_ragged_tensor).save(
os.path.join(output_path, scene_save_name, 'agents', 'orientation')
)

if AGENT_KEYPOINTS:
agents_keypoints_ragged_tensor = utils.agents_keypoints_to_ragged_tensor(
agents_in_odometry_df
)
tf.data.Dataset.from_tensors(agents_keypoints_ragged_tensor).save(
os.path.join(output_path, scene_save_name, 'agents', 'keypoints')
)

robot_in_odometry_df = utils.robot_to_odometry_frame(robot_df)
robot_pos = tf.convert_to_tensor(
np.stack(robot_in_odometry_df.iloc[::subsample]['p'].values).astype(
np.float32
)
)
robot_orientation = tf.convert_to_tensor(
np.stack(robot_in_odometry_df.iloc[::subsample]['yaw'].values).astype(
np.float32
)
)[..., tf.newaxis]

tf.data.Dataset.from_tensors(robot_pos).save(
os.path.join(output_path, scene_save_name, 'robot', 'position')
)
tf.data.Dataset.from_tensors(robot_orientation).save(
os.path.join(output_path, scene_save_name, 'robot', 'orientation')
)

if POINTCLOUD:
scene_pointcloud_dict = utils.get_scene_poinclouds(
os.path.join(input_path, 'test_dataset'), scene, subsample=subsample
)
# Remove extra timesteps
scene_pointcloud_dict = {
ts: scene_pointcloud_dict[ts] for ts in agents_df_subsampled_index
}

scene_pc_odometry = utils.pc_to_odometry_frame(
scene_pointcloud_dict, robot_df
)

filtered_pc = utils.filter_agents_and_ground_from_point_cloud(
agents_in_odometry_df, scene_pc_odometry, robot_in_odometry_df
)

scene_pc_ragged_tensor = tf.ragged.stack(filtered_pc)

assert (
agents_pos_ragged_tensor.bounding_shape()[1]
== scene_pc_ragged_tensor.shape[0]
)

tf.data.Dataset.from_tensors(scene_pc_ragged_tensor).save(
os.path.join(output_path, scene_save_name, 'scene', 'pc'),
compression='GZIP',
)

if __name__ == '__main__':
jrdb_preprocess_test(INPUT_PATH, OUTPUT_PATH)
@@ -30,7 +30,7 @@

INPUT_PATH = '<data_path>'
OUTPUT_PATH = os.path.join(
input_path, '/processed/labels/labels_detections_3d')
INPUT_PATH, 'processed/labels/labels_detections_3d')


def get_agents_3d_bounding_box_dict(input_path, scene):
@@ -129,7 +129,6 @@ def jrdb_train_detections_to_tracks(input_path, output_path):
scenes = utils.list_scenes(
os.path.join(input_path, 'train_dataset'))
for scene in tqdm.tqdm(scenes):
print(f'Processing {scene}')
bb_dict = get_agents_3d_bounding_box_dict(
os.path.join(input_path, 'train_dataset'), scene)
bb_3d_df = pd.DataFrame.from_dict(
@@ -174,7 +173,7 @@ def jrdb_train_detections_to_tracks(input_path, output_path):

labels_dict = detections_to_dict(matched_df)

with os.Open(f"{output_path}/{scene}.json", 'w') as write_file:
with open(f"{output_path}/{scene}.json", 'w') as write_file:
json.dump(labels_dict, write_file, indent=2, ensure_ascii=True)

if __name__ == '__main__':
3 changes: 2 additions & 1 deletion human_scene_transformer/data/utils.py
@@ -37,12 +37,13 @@ def maybe_makedir(path):


def get_file_handle(path, mode='rt'):
file_handle = os.Open(path, mode=mode)
file_handle = open(path, mode)
return file_handle


def list_scenes(input_path):
scenes = os.listdir(os.path.join(input_path, 'labels', 'labels_3d'))
scenes.sort()
return [scene[:-5] for scene in scenes]

