computer-vision

DEV Community

Generate Professional AI Images Locally with ComfyUI and FLUX

EveryLocalAI

6h ago

Professional-grade image generation that runs entirely on your own GPU. ComfyUI + FLUX.1 Dev gives you Midjourney-quality output with full creative control and zero data leaving your machine. What You Need: a GPU with 12GB+ VRAM (24GB recommended); Python 3.10+ or the ComfyUI desktop app; about 20 minutes. Setup, Option A: Desktop App (Easiest). Download from comfy.org, install, and use the built-in mo…

ai, computer-vision, machine-learning

Hacker News

Show HN: Dual YOLOv8n UAV Detection on RK3588S at 42 FPS Using NPU

14h ago

ai, computer-vision

Biological sciences : Scientific Reports subject feeds

Multi-layer feature aggregation network with residual module and attention mechanism for jaw cyst image segmentation

Huixia Zheng et al.

16h ago

ai, computer-vision

DEV Community

Stop Guessing Your Meds: Building a Multimodal RAG Assistant with LLaVA and ChromaDB

Beck_Moulton

1d ago

Ever stared at a cryptic medicine bottle, wondering if it interacts with your morning coffee or that other pill you're taking? For the elderly or those with visual impairments, reading tiny labels on medication packaging is more than a nuisance; it’s a safety hazard. In this tutorial, we are building a Medication Safety Assistant. This isn't just a simple OCR tool; we are implementing a Multimoda…

ai, computer-vision, machine-learning

Research Communities by Springer Nature

A CNN Framework for Real-Time and Edge Deployment with High Accuracy for Lightweight Deep Learning for Bengali OCR

DIPANKAR DEY

1d ago

Bengali OCR is challenging due to complex characters and handwriting variations. This paper proposes a lightweight CNN with 40×40 input for real-time Bengali OCR, achieving 98.29% accuracy with reduced computational complexity for edge deployment.

ai, computer-vision, deep-learning

DEV Community

How to Add Living Photo Effects to Your Web Portfolio

Jakub

1d ago

Static portfolios blend together. Every designer's site has the same grid of JPEGs. We wanted something different for our own product pages at Inithouse, a studio shipping a growing portfolio of products in parallel, so we started experimenting with living photos: short AI-generated animations that make a still image breathe. Here's how we did it, what we learned about performance, and the code y…

ai, computer-vision

Lifeboat News: The Blog

AI vs the Human Eye: Can Algorithms See What Ophthalmologists Miss?

Shubham Ghosh Roy

2d ago

Autonomous AI models can bring specialist-level retinal screening to clinics worldwide, aiding the diagnosis of diabetic retinopathy and other eye conditions.

ai, computer-vision, machine-learning

Hacker News

How to automate Instagram engagements with computer vision (and get banned)

Florian Herrengt

2d ago

Obviously, Instagram does not want you to automate engagement. Their HTML is a mess of randomly generated class names and deeply nested divs. The structure changes every deployment. Any script that relies on DOM selectors breaks within weeks because the class name doesn't exist anymore. But it doesn't matter anyway. Instagram can obfuscate their code all they want because code is for machines. Bu…

computer-science, computer-vision

Biological sciences : Scientific Reports subject feeds

Automated error localisation and correction techniques for deep-learning-based segmentation of 3D MRI sequences based on feature-derived-region aggregation

Adrian C. Ruckli et al.

2d ago

ai, computer-science, computer-vision, deep-learning

Frontiers in Earth Science | New and Recent Articles

LCS-Net: a lightweight architecture for efficient coastal water segmentation

Xinkun Song

2d ago

In high-resolution remote sensing imagery, near-shore water bodies typically exhibit tortuous shorelines, fragmented lakeshore coves, and superimposed disturbances such as building reflections, vegetation shadows, and mixed substrates, posing significant challenges to the fine extraction of water boundaries. Although deep learning-based semantic segmentation has substantially improved water body …

algorithms, computer-science, computer-vision

Frontiers in Artificial Intelligence | New and Recent Articles

Adaptive quadtree-based segmentation of nucleus and cytoplasm in pap-smear images: a lightweight and interpretable approach for automated cytology

Andrew Ware

2d ago

Background: Automated analysis of Pap-smear images plays an important role in cervical cancer screening, particularly in low-resource settings where manual cytology remains labour-intensive, subjective, and prone to inter-observer variability. On the other hand, accurate segmentation of the nucleus and cytoplasm is a fundamental step in computer-aided diagnosis systems because it enables quantitati…

ai, computer-vision, machine-learning, medicine

Biological sciences : Scientific Reports subject feeds

ViT-ConvGAN: a hybrid model for spatiotemporal action recognition using video transformer and 3D CNN

3d ago

ai, computer-science, computer-vision, deep-learning

Roboflow Blog

OCR Lot Code and Expiry Date Verification for Medical Packaging

Mostafa Ibrahim

3d ago

Train a Roboflow localization model, isolate printed label fields, and verify batch numbers and expiry dates with Google Gemini.

ai, computer-vision

Scientific Reports

Lightweight CNN SE transformer for robust weed classification with optimizer aware performance

Vetriselvi T

3d ago

Scientific Reports, published online 11 June 2026; doi:10.1038/s41598-026-52345-6

ai, computer-vision, machine-learning

Roboflow Blog

Launch: Use YOLO26 Semantic Segmentation with Roboflow

Contributing Writer

3d ago

We are excited to announce support for YOLO26 semantic segmentation in Roboflow.

ai, computer-vision

Roboflow Blog

Claude Fable 5 for Vision: Evaluation and Benchmarks

Erik Kokalj

3d ago

Claude Fable 5 is a strong reasoning model for visual understanding but not a state-of-the-art vision model.

ai, computer-vision

Roboflow Blog

Give My Agent Eyes

Yajat Mittal

3d ago

Combine RF-DETR and LMMs to build an AI pipeline that perceives, reasons, and acts.

ai, computer-vision

Scientific Reports

Language-assisted multimodal convolutional transformer pipeline for retinal lesions segmentation

Taimur Hassan

3d ago

Scientific Reports, published online 11 June 2026; doi:10.1038/s41598-026-55337-8

ai, computer-vision, deep-learning

Biological sciences : Scientific Reports subject feeds

Comparative performance of one-stage and two-stage deep learning models for instance segmentation of overhanging dental restorations on bitewing radiographs

4d ago

ai, computer-vision, deep-learning, medicine

The Guardian

Florida lawsuit alleges wrongful arrest after police AI facial recognition error

Richard Luscombe in Miami

4d ago

Robert Dillon was arrested at home in Florida despite living 300 miles away, and charges were later dropped. A Florida man is suing several law enforcement agencies for his arrest and prosecution for allegedly luring a child after he was wrongly identified using faulty AI facial recognition software. According to the Jacksonville Beach police depar…

ai, ai-ethics, computer-vision

research.io
