Automated Author Profile

Dong, Jin Song

National University of Singapore

Current S-Index

3.2

Sum of Dataset Indices for all datasets

Average Dataset Index per Dataset

1.6

Average Dataset Index per dataset

Total Datasets

2

Total datasets for this author

Average FAIR Score

73.1%

Average FAIR Score per dataset

Total Citations

0

Total citations to the author's datasets

Total Mentions

0

Total mentions of the author's datasets
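The aggregate figures above follow directly from the per-dataset values. As a minimal sketch, assuming only what this page states (the S-Index is the sum of the Dataset Indices, and each of the two listed records has a Dataset Index of 1.6), the arithmetic can be reproduced as follows. The variable names are illustrative and not part of any published API.

```python
from statistics import fmean

# Per-dataset Dataset Index values as listed for the two records on this page.
dataset_indices = [1.6, 1.6]

# S-Index: sum of Dataset Indices for all datasets (definition given on this page).
s_index = sum(dataset_indices)

# Average Dataset Index per dataset.
average_index = fmean(dataset_indices)

print(f"S-Index: {s_index:.1f}")
print(f"Average Dataset Index: {average_index:.1f}")
print(f"Total datasets: {len(dataset_indices)}")
```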

Datasets

PhishDecloaker Datasets

This record contains the datasets that are part of the paper "PhishDecloaker: Detecting CAPTCHA-cloaked Phishing Websites via Hybrid Vision-based Interactive Models", published at USENIX Security '24.

Phishing Kit Dataset
Section: 2
Description: For empirical study.
Contents: 100 defanged PHP phishing kits representing the following brands:
1. Microsoft
2. Banco de Oro
3. Microsoft OneDrive
4. Deutsche Kreditbank
5. Adobe Acrobat
6. N26
7. Absa Group
8. DHL
9. Microsoft
10. Correos
11. Kempinski Summerland Hotel & Resort Beirut
12. Vantage West Credit Union
13. NetFlix
14. Agencia Tributaria
15. Square
16. Chronopost
17. PayPal
18. American Express
19. Allegro
20. LinkedIn
21. virtru
22. Citibank
23. AOL
24. Credit Agricole
25. Mercado Pago
26. Université de Pau et des Pays de l'Adour (UPPA)
27. Fifth Third Bank
28. Columbia Bank
29. Alibaba Mail
30. Microsoft OneDrive
31. Intesa Sanpaolo
32. Santander
33. America First Credit Union
34. Barclays
35. Interac
36. USPS
37. Wells Fargo
38. Yahoo
39. XFINITY
40. Berliner Sparkasse
41. OneDrive
42. Standard Bank
43. Wells Fargo
44. aruba.it
45. Bancolombia
46. Caisse d’Epargne
47. DubaiPay
48. Chase Bank
49. M&T Bank
50. Postmaster
51. Volksbanken Raiffeisenbanken
52. Facebook
53. Huntington Bank
54. Commonwealth Bank of Australia
55. Orange
56. shopify
57. Google Drive
58. WalletConnect
59. Meritrust Credit Union
60. Credit Agricole
61. Desjardins
62. Postbank
63. Dropbox
64. DocuSign
65. dpdgroup
66. L'Assurance Maladie
67. Adobe Acrobat
68. Global Sources
69. Microsoft Excel
70. SFR
71. FedEx
72. Citibank
73. Royal Credit Union
74. GoDaddy
75. ADP
76. International Card Services
77. Israeli Post
78. UNI Financial Cooperation
79. TD Bank
80. ATB Mobile
81. HSBC
82. Bank of Montreal
83. RBC Royal Bank
84. IONOS
85. AlaskaUSA Federal Credit Union
86. French Government
87. UOL SAC
88. Banco Itaú Paraguay
89. Amazon
90. Apple
91. AT&T
92. Australian Government
93. Bank of America
94. BNP Paribas
95. eBay
96. ING Group
97. Instagram
98. MetaMask
99. SingTel
100. Société Générale

Landscape Dataset
Section: 4.3
Description: For training the rotation CAPTCHA solver model.
Contents: 7,268 natural and man-made landscape images (320×180).
Format: JPEG images.

CAPTCHA Detection Dataset
Section: 5.2.1
Description: For training the CAPTCHA detection model.
Contents: 19,680 webpage screenshots (1920×1080), 10,680 with annotated CAPTCHA bounding boxes, 9,000 without.
Format: PNG images with annotations in PASCAL VOC and COCO format. All bounding boxes are labeled as the "CAPTCHA" class (no CAPTCHA type categorization).

CAPTCHA Recognition Dataset
Section: 5.2.2
Description: For training the CAPTCHA recognition model.
Contents: 6,612 CAPTCHA images distributed across 38 classes.
Format: PNG images with their corresponding class labels in CSV.
CAPTCHA classes:
1. baidu_slide_rotate
2. dingxiang_audio
3. dingxiang_click_area
4. dingxiang_click_difference
5. dingxiang_click_font
6. dingxiang_click_icon
7. dingxiang_click_vr
8. dingxiang_click_word
9. dingxiang_drag
10. dingxiang_slide_puzzle
11. dingxiang_slide_puzzle2
12. dingxiang_slide_rotate
13. geetest_checkbox
14. geetest_click_icon
15. geetest_click_phrase
16. geetest_click_word
17. geetest_game_playing
18. geetest_game_playing2
19. geetest_select
20. geetest_slide_puzzle
21. hcaptcha
22. hcaptcha_checkbox
23. netease_click_icon
24. netease_click_phrase
25. netease_click_vr
26. netease_click_word
27. netease_drag
28. netease_slide
29. press_and_hold
30. recaptchav2
31. recaptchav2_checkbox
32. tencent_slide
33. text_1
34. text_2
35. text_3
36. text_4
37. text_5
38. text_6

CAPTCHA Open-set Dataset
Section: 5.2.2
Description: For testing the CAPTCHA detection and recognition pipeline.
Contents: 1,100 webpage screenshots (1920×1080), all of which have annotated CAPTCHA classes spanning 11 different categories.
Format: PNG CAPTCHA and screenshot images with their corresponding class labels in CSV.
CAPTCHA classes:
1. arkose_select_2
2. capycaptcha_drag
3. dicecaptcha_qa
4. funcaptcha_select
5. funcaptcha_select_2
6. funcaptcha_select_3
7. funcaptcha_select_4
8. funcaptcha_select_5
9. funcaptcha_select_6
10. keycaptcha_drag
11. mtcaptcha_text

Ablation Dataset
Section: 5.4
Description: For training the CAPTCHA recognition model.
Contents: 722 webpage screenshots (1920×1080), 622 with CAPTCHAs spanning 38 classes, 100 without.
Format: PNG images with their corresponding bounding box and class labels in CSV. Class IDs 0-37 map directly to the class names in the CAPTCHA Recognition Dataset. Class ID 38 marks samples without CAPTCHAs.
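The class-ID convention described for the Ablation Dataset can be sketched as a small lookup. This is only an illustration under stated assumptions: it assumes that class ID 0 corresponds to the first name in the CAPTCHA Recognition Dataset list and that the IDs follow the listed order, which the record does not state explicitly. The `no_captcha` label and the `class_name` helper are hypothetical names introduced here, and the layout of the CSV files is not specified in the record, so no file parsing is attempted.

```python
# Class names in the order the CAPTCHA Recognition Dataset lists them.
# Assumption: class ID i refers to the entry at index i of this list.
RECOGNITION_CLASSES = [
    "baidu_slide_rotate", "dingxiang_audio", "dingxiang_click_area",
    "dingxiang_click_difference", "dingxiang_click_font",
    "dingxiang_click_icon", "dingxiang_click_vr", "dingxiang_click_word",
    "dingxiang_drag", "dingxiang_slide_puzzle", "dingxiang_slide_puzzle2",
    "dingxiang_slide_rotate", "geetest_checkbox", "geetest_click_icon",
    "geetest_click_phrase", "geetest_click_word", "geetest_game_playing",
    "geetest_game_playing2", "geetest_select", "geetest_slide_puzzle",
    "hcaptcha", "hcaptcha_checkbox", "netease_click_icon",
    "netease_click_phrase", "netease_click_vr", "netease_click_word",
    "netease_drag", "netease_slide", "press_and_hold", "recaptchav2",
    "recaptchav2_checkbox", "tencent_slide", "text_1", "text_2", "text_3",
    "text_4", "text_5", "text_6",
]

NO_CAPTCHA_ID = 38                # per the record: samples without CAPTCHAs
NO_CAPTCHA_LABEL = "no_captcha"   # hypothetical label, not from the dataset


def class_name(class_id: int) -> str:
    """Translate an Ablation Dataset class ID into a readable class name."""
    if class_id == NO_CAPTCHA_ID:
        return NO_CAPTCHA_LABEL
    if 0 <= class_id < len(RECOGNITION_CLASSES):
        return RECOGNITION_CLASSES[class_id]
    raise ValueError(f"class ID out of range: {class_id}")


for cid in (0, 37, 38):
    print(cid, class_name(cid))
```

Keeping the no-CAPTCHA case outside the name list keeps the list identical to the 38 recognition classes, so it can be reused unchanged for the Recognition Dataset itself.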

Authors

  • Teoh, Xiwen
  • Lin, Yun
  • Liu, Ruofan
  • Huang, Zhiyong
  • Dong, Jin Song

0 Citations | 0 Mentions | 73% FAIR | 1.6 Dataset Index
DOI: 10.5281/zenodo.11228973 | May 2024

PhishDecloaker Datasets

Authors

  • Dong, Jin Song
  • Teoh, Xiwen
  • Lin, Yun
  • Liu, Ruofan
  • Huang, Zhiyong

0 Citations | 0 Mentions | 73% FAIR | 1.6 Dataset Index
DOI: 10.5281/zenodo.11228974 | May 2024