Thank you for the comprehensive guidelines on fine-tuning LLaVA on custom datasets! The provided example for generating tag-style captions for Stable Diffusion has been very helpful.
I am currently working on fine-tuning LLaVA for object detection tasks and have created the following JSON format:
"image": "part-000001/997bb945-628d-4724-b370-b84de974a19f.jpg",
"conversations": [
  {
    "from": "human",
    "value": "Find objects of the classes vehicle, human, swimming pool, landfill, building, tree, satellite antenna and street light in the image. Provide the coordinates of the bounding boxes in the format class: [bounding box coordinates]."
  },
  {
    "from": "gpt",
    "value": "vehicle: [0.85, 0.139, 0.024, 0.049], vehicle: [0.603, 0.158, 0.029, 0.055], human: [0.945, 0.018, 0.014, 0.036], human: [0.83, 0.028, 0.017, 0.041], human: [0.85, 0.123, 0.024, 0.067]."
  }
]
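For reference, a minimal sketch of how a record in this shape could be assembled from raw annotations. The helper names (`join_classes`, `format_answer`, `make_record`) are hypothetical and not part of LLaVA, and the sketch assumes each box is already a list of four normalized floats; it does not decide what those four values mean.

```python
import json

# Prompt wording copied from the example record above.
PROMPT_TEMPLATE = (
    "Find objects of the classes {classes} in the image. "
    "Provide the coordinates of the bounding boxes in the format "
    "class: [bounding box coordinates]."
)


def join_classes(classes):
    """Join class names as 'a, b and c' (hypothetical helper)."""
    if len(classes) == 1:
        return classes[0]
    return ", ".join(classes[:-1]) + " and " + classes[-1]


def format_answer(boxes):
    """Render (label, [four floats]) pairs as 'label: [a, b, c, d], ... .'"""
    parts = []
    for label, coords in boxes:
        # Round to three decimals, matching the precision in the example.
        nums = ", ".join(str(round(c, 3)) for c in coords)
        parts.append(f"{label}: [{nums}]")
    return ", ".join(parts) + "."


def make_record(image_path, classes, boxes):
    """Build one record with the 'image' and 'conversations' keys shown above."""
    prompt = PROMPT_TEMPLATE.format(classes=join_classes(classes))
    return {
        "image": image_path,
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": format_answer(boxes)},
        ],
    }


if __name__ == "__main__":
    record = make_record(
        "part-000001/997bb945-628d-4724-b370-b84de974a19f.jpg",
        ["vehicle", "human"],
        [("vehicle", [0.85, 0.139, 0.024, 0.049])],
    )
    print(json.dumps(record, indent=2))
```

Generating the answer string from structured annotations, rather than writing it by hand, keeps the label and coordinate formatting identical across every record.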
Could you please confirm whether this JSON structure is correct for fine-tuning LLaVA on object detection tasks? Specifically, I would like to know:
Is the structure of the JSON file appropriate for object detection?
Are the metadata fields correctly defined?
Is the format for bounding box coordinates accurate?
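Independent of the answers to these questions, some basic consistency checks can be run locally. The sketch below is only an illustration: `parse_boxes` and `check_record` are hypothetical names, and the checks (alternating human/gpt turns, four values per box, every value between 0 and 1, labels drawn from the prompt's class list) are assumptions about what a well-formed record looks like, not requirements confirmed by LLaVA.

```python
import re

# Matches 'label: [a, b, c, d]' pairs inside the answer string.
BOX_PATTERN = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*\[([^\]]*)\]")


def parse_boxes(answer):
    """Return (label, [floats]) pairs; raises ValueError on a non-numeric token."""
    boxes = []
    for label, body in BOX_PATTERN.findall(answer):
        coords = [float(tok) for tok in body.split(",") if tok.strip()]
        boxes.append((label, coords))
    return boxes


def check_record(record, allowed_classes):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in ("image", "conversations"):
        if key not in record:
            problems.append(f"missing key: {key}")

    turns = record.get("conversations", [])
    roles = [turn.get("from") for turn in turns]
    # Assumption: turns strictly alternate human, gpt, human, gpt, ...
    expected = ["human", "gpt"] * (len(turns) // 2)
    if not turns or roles != expected:
        problems.append(f"unexpected turn order: {roles}")

    for turn in turns:
        if turn.get("from") != "gpt":
            continue
        for label, coords in parse_boxes(turn.get("value", "")):
            if label not in allowed_classes:
                problems.append(f"unknown class: {label}")
            if len(coords) != 4:
                problems.append(f"{label}: expected 4 values, got {len(coords)}")
            # Assumption: coordinates are normalized to the range 0..1.
            if any(c < 0.0 or c > 1.0 for c in coords):
                problems.append(f"{label}: value outside [0, 1]: {coords}")
    return problems
```

Calling `check_record(record, classes)` on the example above should return an empty list, while a box with three values or a coordinate of 1.5 would be reported. The check does not tell apart `[x, y, width, height]` from `[x_min, y_min, x_max, y_max]`, so the intended convention still has to be stated explicitly.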
Any additional insights or corrections would be greatly appreciated.
Thank you for your assistance!