The BlueJ Blackbox data collection project is an initiative by the
developers of BlueJ to collect data on how BlueJ is used, in order to
increase understanding of how students learn to program. The data
collected is for the purposes of academic research, and will only be
used by computing education researchers.
Access to Data
For access to the data, contact
Frequently Asked Questions:
What data will be sent?
The main data that will be sent is the source code from your projects. We
also record the use of the BlueJ interface: for example, which methods
are invoked, use of the codepad, use of the debugger, and how you use
other features of BlueJ. No identifying information (e.g. username)
will be sent with the data.
Who will the data be sent to?
The data will be sent to a server hosted at the University of Kent in the UK.
Researchers at the University of Kent will have access to the data in order to
analyse it, and access will also be provided to other recognised computing
education researchers for the purpose of analysis.
How will the data be anonymised?
No identifying information (e.g. username, machine name) is sent to the
server. The name of the project is sent, but the full path (which probably
contains your username) is not. Source code is sent, but all comments before
the class begins are blanked out -- that is, the top comment before your class
will be blanked, as that typically contains your name. All other code
(including any other comments) is sent to the server.
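The blanking rule described above can be illustrated with a minimal sketch. This is not BlueJ's actual implementation: the class name HeaderCommentBlanker and its methods are hypothetical, and the sketch assumes that "the class begins" at the first `class` keyword found outside a comment. It does not handle interfaces, enums, or string literals that appear before the class.

```java
// Hypothetical illustration only; not taken from the BlueJ source code.
final class HeaderCommentBlanker {

    /** Returns the source with every comment before the first "class" keyword replaced by spaces. */
    static String blankHeaderComments(String source) {
        StringBuilder out = new StringBuilder(source.length());
        int n = source.length();
        int i = 0;
        while (i < n) {
            if (source.startsWith("/*", i)) {
                // Block comment: blank everything up to and including the closing "*/".
                int end = source.indexOf("*/", i + 2);
                int stop = (end < 0) ? n : end + 2;
                appendBlanked(out, source, i, stop);
                i = stop;
            } else if (source.startsWith("//", i)) {
                // Line comment: blank up to (but not including) the end of the line.
                int end = source.indexOf('\n', i);
                int stop = (end < 0) ? n : end;
                appendBlanked(out, source, i, stop);
                i = stop;
            } else if (isClassKeywordAt(source, i)) {
                // The class has begun: everything from here on is kept verbatim.
                out.append(source, i, n);
                return out.toString();
            } else {
                out.append(source.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** Appends s[from, to) with every character except line breaks replaced by a space. */
    private static void appendBlanked(StringBuilder out, String s, int from, int to) {
        for (int k = from; k < to; k++) {
            char c = s.charAt(k);
            out.append(c == '\n' || c == '\r' ? c : ' ');
        }
    }

    /** True if the whole word "class" (not part of a longer identifier) starts at index i. */
    private static boolean isClassKeywordAt(String s, int i) {
        if (!s.startsWith("class", i)) {
            return false;
        }
        boolean startOk = i == 0 || !Character.isJavaIdentifierPart(s.charAt(i - 1));
        int after = i + "class".length();
        boolean endOk = after >= s.length() || !Character.isJavaIdentifierPart(s.charAt(after));
        return startOk && endOk;
    }

    public static void main(String[] args) {
        String src = "/** @author Alice */\npublic class A {\n  // keep me\n}\n";
        System.out.print(blankHeaderComments(src));
    }
}
```

Replacing the comment with spaces, rather than deleting it, keeps line and column positions unchanged, so the rest of the file lines up with the original. Whether the real system preserves positions in this way is an assumption of the sketch.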
How much traffic will this generate?
The exact rate at which data will be sent is dependent on the actions
you are performing and the size of your source code base. As an
estimate, we believe for a handful of small classes (e.g. the projects
accompanying the textbook) that the upload will
be around 3-4 Megabytes each hour of continued use, and the download
will be around 1 Megabyte each hour. As a quick point of comparison,
loading the BBC news front page once involves downloading around 0.5
Megabytes of data and uploading 0.06 Megabytes.
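To put the comparison above into numbers, the following short sketch adds up the estimated hourly traffic and expresses it as a number of front-page loads. The figures are the estimates quoted above; the class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical illustration of the arithmetic; all figures are the estimates quoted in the text.
final class TrafficEstimate {
    static final double PAGE_DOWNLOAD_MB = 0.5;  // one load of the news front page
    static final double PAGE_UPLOAD_MB = 0.06;

    /** Total traffic in megabytes for one hour of use. */
    static double totalPerHour(double uploadMb, double downloadMb) {
        return uploadMb + downloadMb;
    }

    /** How many front-page loads produce the same total traffic. */
    static double equivalentPageLoads(double totalMb) {
        return totalMb / (PAGE_DOWNLOAD_MB + PAGE_UPLOAD_MB);
    }

    public static void main(String[] args) {
        double low = totalPerHour(3.0, 1.0);   // 3 MB up + 1 MB down
        double high = totalPerHour(4.0, 1.0);  // 4 MB up + 1 MB down
        System.out.printf("Total traffic: %.0f to %.0f MB per hour%n", low, high);
        System.out.printf("Equivalent front-page loads per hour: %.1f to %.1f%n",
                equivalentPageLoads(low), equivalentPageLoads(high));
    }
}
```

On these estimates, an hour of continued use generates about as much total traffic as loading that front page seven to nine times, although most of the BlueJ traffic is upload rather than download.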
How can I opt in/out?
To change your participation in this research, in BlueJ 3.1.0 and later, go to
the Preferences window, and under the Miscellaneous tab there is an option to
change your current participation.
Why do I repeatedly get asked if I want to opt in?
Your participation status is stored in BlueJ's properties file. This is
stored in your user profile directory on your machine. For a home machine, or
a school network which supports persistent profiles, you should be asked once,
and this decision is stored thereafter.
However, if your network does not keep your profile, you will be asked every
time you load BlueJ, because BlueJ cannot tell that you have been asked
before. In this case, you will need to contact your network administrator
and tell them to either let profiles persist (the ideal solution), or
otherwise to alter the bluej.defs file supplied with BlueJ to include the
line described in the next answer.
I'm a network administrator; how do I disable participation for my users?
If you want to opt out your users by default, you can alter the bluej.defs file that is
installed with BlueJ to include the line:
My question isn't answered here
If you are having a technical problem with BlueJ, even if it is
related to the Blackbox project, please contact us
via our standard support form. If you have a question about the
research side of the Blackbox project, you can contact us
List of Publications
Becker, B. A., Murray, C., Tao, T., Song, C., McCartney, R., and Sanders, K., Fix the First, Ignore the Rest: Dealing with Multiple Compiler Error Messages,
Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE '18). ACM, New York, NY, USA, 634-639, 2018.
Mirza, O. M., Joy, M., and Cosma, G., Suitability of BlackBox dataset for style
analysis in detection of source code plagiarism,
Seventh International Conference on
Innovative Computing Technology (INTECH), Luton, pp. 90-94, 2017.
Brown, N. C. C. and Altadmri, A., Novice Java Programming Mistakes: Large-Scale Data vs. Educator Beliefs,
Trans. Comput. Educ. Volume 17, issue 2, Article 7, 21 pages, 2017.
Keuning, H., Heeren, B., and Jeuring, J., Code quality issues in student programs,
Proceedings of the 2017 ACM Conference on Innovation and Technology in
Computer Science Education, ser. ITiCSE ’17. New York, NY, USA: ACM, pp. 110–115, 2017.
McCall, D., and Kölling, M., Meaningful categorisation of novice programmer errors,
Frontiers In Education Conference, pages 2589-2596, 2014. DOI: 10.1109/FIE.2014.7044420
Murray, C., A Comparative Study of Java Compiler Error Profiles Using the Blackbox Dataset,
Master's thesis, University College Dublin, 2016.
Altadmri, A., and Brown, N. C. C., Researching Programming Education with Blackbox (Abstract Only),
In Proceedings of the 47th ACM Technical Symposium on Computing Science Education (SIGCSE '16),
ACM, New York, NY, USA, 702-702, 2016.
Altadmri, A., and Brown, N. C. C., 37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data,
In Proceedings of the 46th ACM Technical Symposium on Computer
Science Education (SIGCSE '15), ACM, New York, NY, USA, 522-527, 2015.
Brown, N. C. C., Kölling, M., McCall, D., and Utting, I., Blackbox: a large scale repository of novice programmers' activity,
In Proceedings of the 45th ACM technical symposium on Computer science education (SIGCSE '14), ACM, New York, NY, USA, 223-228, 2014.
Brown, N. C. C., and Altadmri, A.,
Investigating novice programming mistakes: educator beliefs vs. student data,
In Proceedings of the tenth annual conference
on International computing education research (ICER '14),
ACM, New York, NY, USA, 43-50, 2014.
Brown, N. C. C., Introduction to analysing the BlueJ blackbox data (abstract only),
In Proceedings of the 45th ACM technical symposium on Computer science education (SIGCSE '14), ACM, New York, NY, USA, 748-748, 2014.