openmmla-vision 0.1.0.post4
🎥 OpenMMLA Vision
Video module of the mBox - an open multimodal learning analytics platform. For more details, please refer
to mBox System Design.
Table of Contents
Related Modules
Installation
Uber Server Setup
Video Base & Server Setup
Standalone Setup
Usage
Realtime Indoor-Positioning
Video Frame Analyzer
Visualization
FAQ
Citation
References
License
Related Modules
mbox-uber
mbox-audio
Installation
Uber Server Setup
Before setting up the video base, you need to set up a server hosting the InfluxDB, Redis, Mosquitto, and Nginx
services. Please refer to mbox-uber module.
Video Base & Server Setup
Clone the repository
git clone https://github.com/ucph-ccs/mbox-video.git
Install openmmla-vision
Set up Conda environment
# For Raspberry Pi
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
# For Mac and Linux
wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-$(uname)-$(uname -m).sh"
bash Miniconda3-latest-$(uname)-$(uname -m).sh
Install Video Base
conda create -c conda-forge -n video-base python=3.10.12 -y
conda activate video-base
pip install openmmla-vision[base] # for Linux and Raspberry Pi
pip install 'openmmla-vision[base]' # for Mac
Install Video Server
The video server provides video frame analyzer services.
conda create -c conda-forge -n video-server python=3.10.12 -y
conda activate video-server
pip install openmmla-vision[server] # for Linux and Raspberry Pi
pip install 'openmmla-vision[server]' # for Mac
Set up folder structure
cd mbox-video
./reset.sh
Standalone Setup
If you want to run the entire mBox Video system on a single machine, follow these steps:
Set up the Uber Server on your machine following the instructions in
the mbox-uber module.
Install openmmla-vision with all dependencies:
conda create -c conda-forge -n mbox-video python=3.10.12 -y
conda activate mbox-video
pip install openmmla-vision[all] # for Linux and Raspberry Pi
pip install 'openmmla-vision[all]' # for Mac
Set up the folder structure:
cd mbox-video
./reset.sh
This setup will allow you to run all components of mBox Video on a single machine.
Usage
Realtime Indoor-Positioning
Stream video from camera(s)
Distributed: stream on each camera host machine (e.g. Raspberry Pi, Mac, Linux, etc.)
Centralized: stream to a centralized RTMP server (e.g. client/server, see Raspberry Pi RTMP streaming setup)
Calibrate each camera's intrinsic parameters
Print the chessboard image from ./camera_calib/pattern/ and stick it on a flat surface
Capture chessboard images with your camera and calibrate it by running ./calib_camera.sh
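Calibration recovers the camera's intrinsic matrix K (focal lengths and principal point). As a minimal sketch of what K encodes, the pinhole model below projects a 3D point in camera coordinates to pixel coordinates; the numbers are illustrative placeholders, not values produced by ./calib_camera.sh.

```python
import numpy as np

# Illustrative intrinsic matrix (fx, fy: focal lengths in pixels;
# cx, cy: principal point). Real values come from ./calib_camera.sh.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(point_cam: np.ndarray) -> np.ndarray:
    """Project a 3D point in camera coordinates to pixel coordinates."""
    uvw = K @ point_cam       # homogeneous pixel coordinates
    return uvw[:2] / uvw[2]   # perspective divide

# A point 2 m in front of the camera, 0.5 m to the right:
print(project(np.array([0.5, 0.0, 2.0])))  # -> [520. 240.]
```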
Synchronize the cameras' coordinate systems
Calculate the transformation matrix between the main and alternative cameras:
./sync_camera.sh [-d <num_cameras>] [-s <num_sync_managers>]
Default parameter settings:
-d: 2 (number of cameras to sync)
-s: 1 (number of camera sync managers)
Modes:
Centralized:
./sync_camera.sh -d 2 -s 1
Distributed:
# On camera host (e.g., Raspberry Pi)
./sync_camera.sh -d 1 -s 0
# On synchronizer (e.g., MacBook)
./sync_camera.sh -d 0 -s 1
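Conceptually, synchronization can recover the transform between two cameras that both observe a shared reference (e.g. an AprilTag, per the references below). The sketch here is an assumption about the underlying math, not the script's actual code: given each camera's pose of the tag as a 4x4 homogeneous matrix, chaining through the tag yields the alt-to-main transform.

```python
import numpy as np

def make_T(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical tag poses as seen by each camera (tag frame -> camera frame).
T_tag_to_main = make_T(np.eye(3), np.array([0.0, 0.0, 2.0]))
T_tag_to_alt  = make_T(np.eye(3), np.array([1.0, 0.0, 2.0]))

# Transform taking points from the alternative camera's frame into the
# main camera's frame, chained through the shared tag.
T_alt_to_main = T_tag_to_main @ np.linalg.inv(T_tag_to_alt)

p_alt = np.array([1.0, 0.0, 2.0, 1.0])  # the tag's origin, in alt coordinates
print(T_alt_to_main @ p_alt)            # -> [0. 0. 2. 1.] (tag origin in main coordinates)
```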
Run real-time indoor-positioning system
./run.sh [-b <num_bases>] [-s <num_synchronizers>] [-v <num_visualizers>] [-g <display_graphics>] [-r <record_frames>] [-v <store_visualizations>]
Default parameter settings:
-b: 1 (number of video bases)
-s: 1 (number of video base synchronizers)
-v: 1 (number of visualizers)
-g: true (display the graphics window)
-r: false (record video frames as images)
-v: false (store real-time visualizations)
Modes:
Centralized:
./run.sh
Distributed:
# On camera host (e.g., Raspberry Pi)
./run.sh -b 1 -s 0 -v 0 -g false
# On synchronizer (e.g., MacBook)
./run.sh -b 0 -s 1 -v 1
Video Frame Analyzer
Serve VLM and LLM on video server
vllm
vllm serve openbmb/MiniCPM-V-2_6 --dtype auto --max-model-len 2048 --port 8000 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code --enforce-eager
vllm serve microsoft/Phi-3-small-128k-instruct --dtype auto --max-model-len 1028 --port 8001 --api-key token-abc123 --gpu_memory_utilization 0.8 --trust-remote-code --enforce-eager
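vLLM serves an OpenAI-compatible API, so the VLM launched above can be queried at its /v1/chat/completions endpoint. The sketch below only constructs the request body; the model name and API key come from the serve commands above, while the prompt text and the image data URL are placeholders.

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/chat/completions endpoint.
# Model name and API key match the `vllm serve` command; the image URL
# is a placeholder for a real base64-encoded frame.
payload = {
    "model": "openbmb/MiniCPM-V-2_6",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is happening in this frame."},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}},
        ],
    }],
    "temperature": 0,
}
headers = {"Authorization": "Bearer token-abc123",
           "Content-Type": "application/json"}

body = json.dumps(payload)  # POST this to http://<video-server>:8000/v1/chat/completions
print(body[:40])
```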
ollama
Install Ollama from the official website.
ollama pull llava:13b
ollama pull llama3.1
Configure conf/video_base.ini
[Server]
backend = ollama
top_p = 0.1
temperature = 0
vlm_model = llava:13b
llm_model = llama3.1
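This INI section can be read with Python's standard configparser; the sketch below inlines the snippet above and parses it. How the analyzer actually loads its config is an assumption here; only the section and key names are taken from the file.

```python
import configparser

# The [Server] section from conf/video_base.ini, inlined for illustration.
ini_text = """
[Server]
backend = ollama
top_p = 0.1
temperature = 0
vlm_model = llava:13b
llm_model = llama3.1
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

server = config["Server"]
backend = server.get("backend")               # "ollama"
temperature = server.getfloat("temperature")  # 0.0
vlm_model = server.get("vlm_model")           # "llava:13b"
print(backend, temperature, vlm_model)
```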
Serve frame analyzer on video server
cd examples/
python video_frame_analyzer_server.py
Run client script on video base
python analyze_video_frame.py
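A client like this typically captures a frame and ships it to the analyzer as a base64 data URL, the form that OpenAI-style vision APIs accept. The helper below is a hypothetical illustration of that encoding step only (the fake JPEG bytes stand in for a real capture, e.g. from cv2.imencode), not code from analyze_video_frame.py.

```python
import base64

def frame_to_data_url(jpeg_bytes: bytes) -> str:
    """Encode raw JPEG bytes as a base64 data URL for a VLM request."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{b64}"

# Stand-in for a real captured frame (JPEG magic number + padding).
fake_frame = b"\xff\xd8\xff\xe0" + b"\x00" * 16
url = frame_to_data_url(fake_frame)
print(url[:30])
```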
Visualization
After running the analyzers, logs and visualizations are stored in the /logs/ and /visualizations/ folders.
The following image shows a simple demo of the video frame analyzer:
FAQ
Citation
If you use this code in your research, please cite the following paper:
@inproceedings{li2024mbox,
  author = {Li, Zaibei and Jensen, Martin and Nolte, Alexander and Spikol, Daniel},
  year = {2024},
  month = {03},
  pages = {785-791},
  title = {Field report for Platform mBox: Designing an Open MMLA Platform},
  doi = {10.1145/3636555.3636872}
}
References
apriltags
License
This project is licensed under the MIT License - see the LICENSE file for details.