openmmla-vision 0.1.0.post4
🎥 OpenMMLA Vision
Video module of the mBox - an open multimodal learning analytics platform. For more details, please refer
to mBox System Design.
Table of Contents
Related Modules
Installation
Uber Server Setup
Video Base & Server Setup
Standalone Setup
Usage
Realtime Indoor-Positioning
Video Frame Analyzer
Visualization
FAQ
Citation
References
License
Related Modules
mbox-uber
mbox-audio
Installation
Uber Server Setup
Before setting up the video base, you need to set up a server hosting the InfluxDB, Redis, Mosquitto, and Nginx
services. Please refer to mbox-uber module.
Video Base & Server Setup
Clone the repository
git clone https://github.com/ucph-ccs/mbox-video.git
Install openmmla-vision
Set up Conda environment
# For Raspberry Pi
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
# For Mac and Linux
wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-$(uname)-$(uname -m).sh"
bash Miniconda3-latest-$(uname)-$(uname -m).sh
Install Video Base
conda create -c conda-forge -n video-base python=3.10.12 -y
conda activate video-base
pip install openmmla-vision[base] # for Linux and Raspberry Pi
pip install 'openmmla-vision[base]' # for Mac
Install Video Server
The video server provides video frame analyzer services.
conda create -c conda-forge -n video-server python=3.10.12 -y
conda activate video-server
pip install openmmla-vision[server] # for Linux and Raspberry Pi
pip install 'openmmla-vision[server]' # for Mac
Set up folder structure
cd mbox-video
./reset.sh
Standalone Setup
If you want to run the entire mBox Video system on a single machine, follow these steps:
Set up the Uber Server on your machine following the instructions in
the mbox-uber module.
Install openmmla-vision with all dependencies:
conda create -c conda-forge -n mbox-video python=3.10.12 -y
conda activate mbox-video
pip install openmmla-vision[all] # for Linux and Raspberry Pi
pip install 'openmmla-vision[all]' # for Mac
Set up the folder structure:
cd mbox-video
./reset.sh
This setup will allow you to run all components of mBox Video on a single machine.
Usage
Realtime Indoor-Positioning
Stream video from camera(s)
Distributed: stream on each camera host machine (e.g. Raspberry Pi, Mac, Linux, etc.)
Centralized: stream to a centralized RTMP server (e.g. client/server, see Raspberry Pi RTMP streaming setup)
Calibrate each camera's intrinsic parameters
Print the chessboard image from ./camera_calib/pattern/ and stick it on a flat surface
Capture chessboard images with your camera and calibrate it by running ./calib_camera.sh
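Calibration recovers the camera's intrinsic matrix K (focal lengths and principal point). As a minimal sketch of what K encodes, the pinhole model below projects a 3D point in camera coordinates to pixel coordinates; the numbers are illustrative placeholders, not values produced by ./calib_camera.sh.

```python
import numpy as np

# Illustrative intrinsic matrix (fx, fy: focal lengths in pixels;
# cx, cy: principal point). Real values come from ./calib_camera.sh.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(point_cam: np.ndarray) -> np.ndarray:
    """Project a 3D point in camera coordinates to pixel coordinates."""
    uvw = K @ point_cam       # homogeneous pixel coordinates
    return uvw[:2] / uvw[2]   # perspective divide

# A point 2 m in front of the camera, 0.5 m to the right:
print(project(np.array([0.5, 0.0, 2.0])))  # -> [520. 240.]
```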
Synchronize the cameras' coordinate systems
Calculate the transformation matrix between the main and alternative cameras:
./sync_camera.sh [-d <num_cameras>] [-s <num_sync_managers>]
Default parameter settings:
-d: 2 (number of cameras to sync)
-s: 1 (number of camera sync managers)
Modes:
Centralized:
./sync_camera.sh -d 2 -s 1
Distributed:
# On camera host (e.g., Raspberry Pi)
./sync_camera.sh -d 1 -s 0
# On synchronizer (e.g., MacBook)
./sync_camera.sh -d 0 -s 1
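Conceptually, synchronization can recover the transform between two cameras that both observe a shared reference (e.g. an AprilTag, per the references below). The sketch here is an assumption about the underlying math, not the script's actual code: given each camera's pose of the tag as a 4x4 homogeneous matrix, chaining through the tag yields the alt-to-main transform.

```python
import numpy as np

def make_T(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical tag poses as seen by each camera (tag frame -> camera frame).
T_tag_to_main = make_T(np.eye(3), np.array([0.0, 0.0, 2.0]))
T_tag_to_alt  = make_T(np.eye(3), np.array([1.0, 0.0, 2.0]))

# Transform taking points from the alternative camera's frame into the
# main camera's frame, chained through the shared tag.
T_alt_to_main = T_tag_to_main @ np.linalg.inv(T_tag_to_alt)

p_alt = np.array([1.0, 0.0, 2.0, 1.0])  # the tag's origin, in alt coordinates
print(T_alt_to_main @ p_alt)            # -> [0. 0. 2. 1.] (tag origin in main coordinates)
```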
Run real-time indoor-positioning system
./run.sh [-b <num_bases>] [-s <num_synchronizers>] [-v <num_visualizers>] [-g <display_graphics>] [-r <record_frames>] [-v <store_visualizations>]
Default parameter settings:
-b: 1 (number of video bases)
-s: 1 (number of video base synchronizers)
-v: 1 (number of visualizers)
-g: true (display the graphics window)
-r: false (record video frames as images)
-v: false (store real-time visualizations)
Modes:
Centralized:
./run.sh
Distributed:
# On camera host (e.g., Raspberry Pi)
./run.sh -b 1 -s 0 -v 0 -g false
# On synchronizer (e.g., MacBook)
./run.sh -b 0 -s 1 -v 1
Video Frame Analyzer
Serve VLM and LLM on video server
vllm
vllm serve openbmb/MiniCPM-V-2_6 --dtype auto --max-model-len 2048 --port 8000 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code --enforce-eager
vllm serve microsoft/Phi-3-small-128k-instruct --dtype auto --max-model-len 1028 --port 8001 --api-key token-abc123 --gpu_memory_utilization 0.8 --trust-remote-code --enforce-eager
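vLLM serves an OpenAI-compatible API, so the VLM launched above can be queried at its /v1/chat/completions endpoint. The sketch below only constructs the request body; the model name and API key come from the serve commands above, while the prompt text and the image data URL are placeholders.

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/chat/completions endpoint.
# Model name and API key match the `vllm serve` command; the image URL
# is a placeholder for a real base64-encoded frame.
payload = {
    "model": "openbmb/MiniCPM-V-2_6",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is happening in this frame."},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}},
        ],
    }],
    "temperature": 0,
}
headers = {"Authorization": "Bearer token-abc123",
           "Content-Type": "application/json"}

body = json.dumps(payload)  # POST this to http://<video-server>:8000/v1/chat/completions
print(body[:40])
```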
ollama
Install Ollama from the official website.
ollama pull llava:13b
ollama pull llama3.1
Configure conf/video_base.ini
[Server]
backend = ollama
top_p = 0.1
temperature = 0
vlm_model = llava:13b
llm_model = llama3.1
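This INI section can be read with Python's standard configparser; the sketch below inlines the snippet above and parses it. How the analyzer actually loads its config is an assumption here; only the section and key names are taken from the file.

```python
import configparser

# The [Server] section from conf/video_base.ini, inlined for illustration.
ini_text = """
[Server]
backend = ollama
top_p = 0.1
temperature = 0
vlm_model = llava:13b
llm_model = llama3.1
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

server = config["Server"]
backend = server.get("backend")               # "ollama"
temperature = server.getfloat("temperature")  # 0.0
vlm_model = server.get("vlm_model")           # "llava:13b"
print(backend, temperature, vlm_model)
```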
Serve frame analyzer on video server
cd examples/
python video_frame_analyzer_server.py
Run client script on video base
python analyze_video_frame.py
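A client like this typically captures a frame and ships it to the analyzer as a base64 data URL, the form that OpenAI-style vision APIs accept. The helper below is a hypothetical illustration of that encoding step only (the fake JPEG bytes stand in for a real capture, e.g. from cv2.imencode), not code from analyze_video_frame.py.

```python
import base64

def frame_to_data_url(jpeg_bytes: bytes) -> str:
    """Encode raw JPEG bytes as a base64 data URL for a VLM request."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{b64}"

# Stand-in for a real captured frame (JPEG magic number + padding).
fake_frame = b"\xff\xd8\xff\xe0" + b"\x00" * 16
url = frame_to_data_url(fake_frame)
print(url[:30])
```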
Visualization
After running the analyzers, logs and visualizations are stored in the /logs/ and /visualizations/ folders.
The following image shows a simple demo of the video frame analyzer:
FAQ
Citation
If you use this code in your research, please cite the following paper:
@inproceedings{li2024mbox,
  author = {Li, Zaibei and Jensen, Martin and Nolte, Alexander and Spikol, Daniel},
  year = {2024},
  month = {03},
  pages = {785-791},
  title = {Field report for Platform mBox: Designing an Open MMLA Platform},
  doi = {10.1145/3636555.3636872}
}
References
apriltags
License
This project is licensed under the MIT License - see the LICENSE file for details.