PolyU develops an innovative Language Model Linguistic Personality Assessment system, advancing AI for diverse applications in manufacturing, business and education (IMAGE)
Caption
Scatter plots illustrate how reversing the rating scale affects the consistency of GPT-4-Turbo’s responses to 44 questions. Circles mark the questions where responses under the original and reversed scales disagree. The left plot, using the Big Five Inventory (BFI), shows 16 inconsistencies with a weighted Cohen’s kappa of 0.401. The right plot, using our rating system, shows only 6 inconsistencies with a higher weighted Cohen’s kappa of 0.877, indicating strong agreement and improved reliability.
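For readers who want to see how such an agreement figure can be computed, the following is a minimal sketch of a weighted Cohen's kappa between ratings given under an original scale and ratings given under a reversed scale. It is an illustration, not the researchers' code. The 1-to-5 scale, the linear weighting, the helper names (`recode_reversed`, `count_inconsistencies`, `weighted_kappa`) and the sample ratings are all assumptions made for the example; the caption does not state which weighting scheme or scale was used.

```python
def recode_reversed(rating, low=1, high=5):
    """Map a rating given on a reversed scale back onto the original scale.

    Assumes an integer scale from `low` to `high` (1 to 5 here, an assumption).
    """
    return low + high - rating


def count_inconsistencies(a, b):
    """Count items whose two ratings differ after recoding."""
    return sum(1 for x, y in zip(a, b) if x != y)


def weighted_kappa(a, b, categories, weights="linear"):
    """Weighted Cohen's kappa for two equal-length lists of ordinal ratings.

    `weights` is "linear" or "quadratic"; which one the source used is unknown.
    """
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (rating in a, rating in b).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1.0 / n

    # Marginal proportions for each condition.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        return float("nan")  # undefined when both conditions use one category
    return 1.0 - obs / exp


# Hypothetical ratings for six items (not data from the study).
original = [1, 2, 3, 4, 5, 3]
reversed_raw = [5, 4, 3, 2, 1, 3]          # same answers on a reversed scale
recoded = [recode_reversed(r) for r in reversed_raw]

scale = [1, 2, 3, 4, 5]
kappa = weighted_kappa(original, recoded, scale)
mismatches = count_inconsistencies(original, recoded)
```

In this toy case the recoded answers match the originals exactly, so there are no inconsistencies and kappa equals 1. Values near 0 indicate agreement no better than chance, which is why a higher kappa in the caption is read as greater consistency across the two scale directions.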
Credit
© 2025 Research and Innovation Office, The Hong Kong Polytechnic University. All Rights Reserved.
Usage Restrictions
nil
License
Original content