I made a possible mistake on a live project at work, how to handle this mess?

Question

I am a software developer who just started out at this company. It's been almost three weeks.

On Friday when I was about to leave work, a co-worker stopped me and told me to make changes to already working code.

The fact that this live project would be used over the weekend by a real client in a production environment was not made known to me.

Usually QA tests everything, but strangely enough, no QA was notified and no functional tests were run.

I got a phone call this morning from the project manager, who sounded panicked and asked me what changes I had made. I also talked to a senior developer, who screamed at me that I had not tested or fully written it and that he now has to make the fixes.

So I spoke to the project manager again, and he said we will all have a meeting.

If I had known that this was to be run on Sunday live, I would've tested out the code. I do believe I am at fault here and I should probably own up to it. But I feel a bit angered that I listened to a co-worker blindly without consulting the project manager or senior developer. If those modifications suggested by the co-worker hadn't been made, everything would've worked perfectly. He needed those modifications so he could use it in his other project. No ill intention was there, but overall there wasn't the same level of awareness or urgency passed along here.

Now I fear that I may be fired because I have not been at this company long. On Monday, what should I do?

This is my first time working as a developer and while I desperately tried not to make critical mistakes, it happened. Now I am at a loss as to how to deal with this overall situation.

How did the meeting go? It would make an interesting footnote to the question! — Angelo, Commented Apr 25, 2012 at 14:41
Did you deploy the code to production, or just commit changes to the library? What's the company policy on committing to the library? In any company I've worked for, programmers routinely commit code to the library that is not ready for production. At my present job, we commit to the library, then deploy from there to a Dev region where our QA tests, when they're happy it goes to a region where the client tests, and then finally we deploy to production. It's up to the person responsible for deployment to know what should get deployed to production, not the person who committed the code. — Jay, Commented May 24, 2016 at 20:16
How a junior dev was able to push code changes into production is baffling. Why isn't there a code review system with critical reviewers, gating stages in the development pipeline, sign-offs, etc? — Cloud, Commented May 24, 2016 at 20:49
Judging by the comments, the issue seems to be somewhat over now, but if you guys do project "Retrospective" or "Lessons Learned" type meetings, this would be a good one to bring up! — komodosp, Commented May 25, 2016 at 15:25

maple_shaft · Accepted Answer · 2012-04-23 12:02:52Z

31

You have culpability here but I would say it is not entirely your fault.

Of course as a developer you should always test your own code before checking in and should NEVER rely upon QA as a test plan. This goes without saying.

On the other hand, I strongly feel that they should take some blame here too:

They did not communicate to you when the code was going to be deployed to live/production/client.
A peer did not review your code changes before they went through the QA process. This should be mandatory policy on your team, especially for junior and new developers.
Their QA team should have been more thorough before blindly authorizing the release of software to production.

Don't let them place all the blame on you. Own your mistakes, but point out the other failure points that led to the situation. If your manager, and your company for that matter, are worth working for, they will be more concerned about resolving the issue than about assigning blame.

We all make mistakes, every one of us.


2

I admitted not testing the code at the last minute as I was about to leave was my fault and lack of attention to detail. Normally I always get tickets on TRAC from the QA team and since that never happened, I did not think it was possible for this project to be even released in the first place. I was also told that indeed no QA was assigned to this project from the start(!) to my shock and that they should've told me. The senior dev has taken over this project now and no real damage reported but I still feel bad for senior dev now having to pick this up.
– joezlja
Commented Apr 24, 2012 at 5:53
6

@joezlja you did the right thing by owning your mistake but they sound like they have some issues of their own they need to work out. Don't feel bad about the project getting assigned to the senior dev because they probably realize that this project is too "wild west cowboy" to be done safely by a new guy.
– maple_shaft
Commented Apr 24, 2012 at 11:05
1

How do I ensure that people send only non-verbal (e.g. e-mail) requests for new changes, so that everything is clear and upfront and there are no surprises? People in a real hurry could frown when you ask them to make a formal request instead.
– Steam
Commented Jul 26, 2014 at 15:36
@Steam Let them frown. Will the universe cease to exist because you inconvenienced somebody?
– maple_shaft
Commented Sep 30, 2014 at 10:51
I wouldn't feel bad about the senior dev's having to clean this up. It goes with the territory of allowing less senior developers to work on code that's significant enough that they can build the skills necessary to become senior. It's easy to be irritated by the need to do this and to allow that irritation to leak through, but if your senior is worth his salt he knows the deal.
– Amy Blankenship
Commented May 25, 2016 at 15:10


weronika · Accepted Answer · 2012-04-23 04:18:15Z

39

Definitely own up to your mistake - this is clearly your fault, since you didn't do the tests, and the fact that other people may have been at fault as well is no excuse. (And in any case, maybe they didn't - maybe you should have known the system was going to be live on Sunday and you just didn't check some important resource to find out!)

However, the other important step is to make it clear this won't happen again - not by promising it won't, but by explaining why it happened and what steps you're going to take to prevent it in the future, and, if necessary (which it seems like it is), asking for help. It looks like you were missing some important information - ask for guidelines on how to deal with requests for last-minute changes, where to check to make sure what would be going live when, etc. Again, the point of this is not to put the blame on anyone else, but to figure out exactly what you did wrong or what you should have known and all the things you can do to make better decisions in the future. Hopefully it will become clear that your mistake was due to something that's easily fixable, and everyone will feel reasonably confident that it won't happen again.

The fact that you're new at the job may work in your favor here - new people are expected to make some mistakes and to improve rapidly. If, after working there for a year, you still didn't know enough to keep from causing this kind of issue, it would clearly be a sign of a significant problem, but since you've not been there long enough to be very familiar with how everything works, people may feel that this was a newbie mistake that's unlikely to be repeated.

Again - this is really the most important point here - during the meeting, make it obvious that your priority is preventing the problem from happening again, not avoiding blame, and they may decide that approach makes you a good enough employee to keep.


2

Just registered to add this comment: ask yourself "why?" five times. It will help you find the answers to the questions you will most probably be asked. (Why did the system break? Because I committed untested code. Why did you commit untested code? Because I ...) Go on until you find an answer related to a process you failed to follow rather than a personal mistake you made ("I wasn't careful enough" isn't a good root cause; "I didn't correctly follow the code-review-test-commit process #QA101" is better). This is a good starting point from which to fix issues.
– STT LCU
Commented Apr 23, 2012 at 8:40
4

The final "why?" should lead to an answer like "I didn't ask for written confirmation from my supervisor" or anything else that involves at least a part of the company system or hierarchy (anyway, it is not their fault). Try not to worry too much: if they fire a new employee at his first mistake, chances are it isn't a nice place to work 40 hours a week.
– STT LCU
Commented Apr 23, 2012 at 9:25
5

+1 for focusing on addressing the problem rather than addressing the blame.
– Burhan Ali
Commented Apr 23, 2012 at 12:12
1

@STTLCU - yes, exactly that: end up with a reason that's a failure of a process, which you can believably say won't happen again. Of course, there may be multiple answers to some of the "why" questions and more than one failure of process - important to note all of those, too.
– weronika
Commented Apr 23, 2012 at 15:23
+1 for "Own up to the mistake", analyzing the chain of events leading up to it, and taking steps to ensure it never happens again company-wide (i.e. make sure the "co-worker asking you to change production code at 5 o'clock on a Friday" thing never happens again, to anyone)
– voretaq7
Commented Apr 23, 2012 at 19:27


Linda Botes · Accepted Answer · 2012-04-23 09:04:24Z

Making a mistake as a new employee is truly nerve-racking, but since it was your first mistake it is unlikely that you will be fired: taking on a new employee always implies an investment that the company will be reluctant to lose.

I would suggest the following:

Remember to ALWAYS test your code before deploying. This is a rule you should never break.
Find out how change control and the development cycle work at your company, i.e.: Are informal requests allowed, or do they need to be specified and signed off prior to development? When and by whom are changes deployed? When and where does testing take place? If there is no controlled process in place, you can potentially score points by helping them put one in place ;-) If there is, your employer should have explained this process to you clearly so as to prevent a mistake like this.
Find out what the line of command is. Who has the authority to request and sign off changes? Once you have the answer, NEVER act on change requests from an unauthorized person without checking with the correct authority.
It is not clear from your post where you implemented your changes. Did you deploy them directly to a production server? It is essential to have a staging or development server where developers can test their code prior to deployment to production. To best prevent this kind of mistake, developers should ideally not have direct access to the production servers, and tested code should be deployed to production only after QA has signed it off. This should happen as part of a controlled change control process.

Taking responsibility doesn't just mean owning up to your mistake, but also taking part in the implementation of processes that make things work better.

Good luck with the meeting!

Good answer, and it appears that this is your first ever on any Stack Exchange site, welcome! — yannis, Commented May 4, 2012 at 10:27

kevin cline · Accepted Answer · 2012-04-25 14:23:03Z

This process is seriously broken, and it's not your fault. It is absurd to put code into production with no review of any kind. Ideally the change should be reviewed, and a full regression test should be run.

If time is critical, then the absolute minimum would be for two developers to agree on the fix. In such cases, management should understand that there is a significant chance (> 1%) that the fix will fail.

You weren't trained, and you don't have authority over the code that is in production, or the deployment process. Never take responsibility when you don't have authority. Explain what you were asked to do, and how you responded. A rational response by management would be to analyze the incident, then modify the process to prevent future occurrences. Anything else is irrational. Do not stay in irrational organizations.

On top of everything else, putting code into production on a Friday evening, which means 60+ hours until you come back and can fix it, is bad. — gnasher729, Commented Nov 24, 2014 at 19:32

Community · Accepted Answer · 2020-06-16 10:59:44Z

I had to go back and re-read this to make sure I interpreted it correctly. I am also going to make some inferences.

On Friday when I was about to leave work, a co-worker stopped me and told me to make changes to already working code.

The fact that this live project would be used over the weekend by a real client in a production environment was not made known to me.

So basically you were given an assignment without an explicitly stated deadline, and, not knowing that your check-in was going to end up as part of a production release, you partially completed it Friday with the intent of finishing up after the weekend? Like a good developer, before leaving for the weekend you checked in your progress in case the cleaning crew dumped a mop bucket on your computer? It really doesn't sound like you did anything wrong per se, except that your code ended up as part of a production release. Maybe the senior developer didn't think you would stay around to work on the code, and so they never bothered to tell you about the release (dangerous assumption).

So, operating under the assumptions above, here is how I would approach it.

Be factual and objective

Your initial post is already defensive. "Well, no one told me, and QA normally tests, and.. and.. and.." Nobody cares. Stick to the facts. "I was asked right before I left to work on XYZ. I wasn't aware that there was a production release this weekend and that the code would be included. I checked in some of my changes before I left for the weekend so I didn't lose work, but they were not complete or fully tested."

Accept responsibility, and ask how you could have done better

Responsibility isn't the same as blame. Admit that you checked the code in under false assumptions and that it caused a problem. Then, instead of defending your actions, focus on something far more productive... figuring out how you can prevent it from happening again. "If I run into this situation again, would it be better if I checked my code in as a shelveset or didn't check it in at all until I felt it was production ready? Do we have a separate source control branch that I should be using for partial check-ins so they don't accidentally end up in a production release?"

Listen to the responses

It is amazing that this has to be said, but listen to what the senior developers and project managers are suggesting you do to prevent the issue. People will generally be far more willing to help someone they know is interested in listening and learning.

This isn't the place to criticize processes

It would be totally unproductive to walk into the meeting and say, "Well, if our branching strategy was better, and we would have followed our standard processes, and..." This just comes across as blaming. It sounds to me like there very well IS an issue with process (considering untested/unapproved code got released into production), but there is a time and place to bring up these suggestions. The system currently in place is at least minimally working, and while a better system might have prevented the issue, try to fully understand the current system before pointing fingers at it.

Have compassion for yourself

You are the new dev. You made a mistake. We all have, and will continue to do so from time to time because we are human. I have found over the years that, unless you do something completely negligent and incompetent, mistakes usually aren't career-ending. You can choose to spend your energy beating yourself up about it, defending your actions, or making it right and figuring out how to do better in the future. Most people will respect a person that says, "Yep, I messed up. I'd like to fix it and I am willing to listen."

Jeremy Stein · Accepted Answer · 2012-06-15 04:17:00Z

4

How did the code get into production? Developer accounts should not have access to deploy code to production. Any system that relies on developers to use good judgment is only going to be as good as the worst judgment. If there is a serious consequence to breaking production (and it sounds like there was), then there should be a serious system to ensure that each release is approved. Not just a process that people follow, but account privileges which require the appropriate person to approve the change. You should not be able to deploy bad code to production even if you wanted to. If you can, then the ultimate issue is that your system is broken.
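
As a rough illustration of the point above, here is a minimal Python sketch of a release gate that enforces approval in code rather than by convention. Everything in it (the `Release` class, the `RELEASE_APPROVERS` set, the account names) is hypothetical and not taken from the asker's company; a real setup would normally enforce this through the permissions of the deployment tooling itself.

```python
from dataclasses import dataclass, field


class DeployRefused(Exception):
    """Raised when a release does not meet the approval rules."""


@dataclass
class Release:
    version: str
    author: str
    approvals: set[str] = field(default_factory=set)


# Hypothetical accounts that are allowed to approve production releases.
RELEASE_APPROVERS = {"senior_dev", "project_manager"}


def approve(release: Release, approver: str) -> None:
    """Record an approval, but only from an authorized account other than the author."""
    if approver not in RELEASE_APPROVERS:
        raise DeployRefused(f"{approver} is not allowed to approve releases")
    if approver == release.author:
        raise DeployRefused("authors cannot approve their own release")
    release.approvals.add(approver)


def deploy_to_production(release: Release) -> str:
    """Refuse to deploy unless at least one valid approval is on record."""
    if not release.approvals:
        raise DeployRefused(f"{release.version} has no approval on record")
    return f"deployed {release.version}"
```

With a gate like this, a last-minute Friday change from a new developer simply cannot reach production until an authorized second person has signed it off, whatever anyone's judgment was at the time.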


1

I've never seen bureaucracy, policies and access control successfully substituting common sense, responsibility and good judgement. I've seen the reverse many times though. Just saying.
– pap
Commented Jun 15, 2012 at 6:48
1

@pap: Good point. Perhaps I've overstated my case. I just know that I get really nervous when I have access to production and I much prefer being unable to make that particular mistake. We've all had that uh-oh moment when we realized we were actually on production instead of dev.
– Jeremy Stein
Commented Jun 18, 2012 at 13:19
1

Where I work they have this separation of roles in place, but it's not applied in a very thoughtful or effective way so, most of the time, it actually works against us. While it might be more secure, it makes turnaround time worse, and often makes fixing a problem harder. There has to be a way to apply this in a common sense way, though. Or maybe what we really need is some sort of way to audit every change done.
– Keoma
Commented Nov 6, 2012 at 20:28


Jay · Accepted Answer · 2016-05-24 20:32:52Z

I agree with a lot of what others have said and there's no point in repeating it. Just a few additional comments:

You should certainly admit to your mistakes. Avoid following such an admission with a "but" followed by anything that sounds like whining or blaming others. That said, don't accept blame for what's not your fault. Avoid naming names unless pressed. Like in this case, I might say, "I was told to make such-and-such change" without saying who told me. Sure, if the boss doesn't know he'll figure it out or ask sooner or later, but you avoid sounding like you're trying to blame others if you do it right.

From your post, it sounds to me like either you were confused about the procedures for committing work to the library with your company, or the company has poor control of this. Do you normally only commit to the library after QA has approved the changes? If not, how does the person responsible for deploying to production know when changes are ready to go? If the answer is, "Well, like, they're just supposed to know, you know" then there's a problem.

If you were told the procedures and you didn't follow them because you were in a hurry to go home or whatever, that's your bad. If you followed the procedures but somebody else deployed prematurely because he wasn't paying attention, that's not your fault at all. If the company has no clear way to tell what should be deployed and what shouldn't, that's the company's problem.

A few folks have made comments about peer reviews and QA approval. That of course depends on the company. In a small company there may be no peer reviews or QA staff. At my previous job I fought for a long time to get an actual QA department, rather than the haphazard testing the company was doing.

But in any case, at what point you commit to the library depends on company procedures. Any place I've worked, the programmer tests on his desktop, then commits to the library when he's satisfied. From there it's deployed to regions where QA and/or the user can test. So you can't necessarily say, "Don't commit until it passes QA." IMHO it makes good sense to deploy to the QA region from the library, so it has to be in the library before we can QA. But what's a good policy is not the immediate question. The question is what the policy is at your company, and whether you followed it.
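
To make the kind of flow I'm describing concrete, here is a small Python sketch of staged promotion: a build starts in the library and can only move forward one region at a time, and it cannot leave a test region without a sign-off recorded for that region. The `Stage` and `Build` names are made up for illustration, and the rules are only my assumption of a reasonable policy, not any particular company's procedure.

```python
from enum import IntEnum


class Stage(IntEnum):
    LIBRARY = 0      # committed by the programmer
    QA = 1           # region where QA tests
    CLIENT_TEST = 2  # region where the client tests
    PRODUCTION = 3


class Build:
    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = Stage.LIBRARY
        self.signed_off: set[Stage] = set()

    def sign_off(self, stage: Stage) -> None:
        """Record that the testers for the build's current region are happy."""
        if stage != self.stage:
            raise ValueError(f"{self.name} is in {self.stage.name}, not {stage.name}")
        self.signed_off.add(stage)

    def promote(self) -> Stage:
        """Move the build one region forward if the current region allows it."""
        if self.stage is Stage.PRODUCTION:
            raise ValueError(f"{self.name} is already in production")
        # Leaving the library needs no sign-off (the programmer commits when
        # satisfied); leaving any test region needs that region's sign-off.
        if self.stage is not Stage.LIBRARY and self.stage not in self.signed_off:
            raise PermissionError(f"{self.name} has no sign-off for {self.stage.name}")
        self.stage = Stage(self.stage + 1)
        return self.stage
```

Under rules like these, a commit to the library is harmless by construction: the person deploying to production doesn't need to "just know" what is ready, because nothing unsigned can get there.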

Andrew T Finnell · Accepted Answer · 2012-04-23 03:53:19Z

-1

If I had known that this was to be run on Sunday live, I would've tested out the code.

This is your first mistake. Why is there a pre-condition on you testing your code? You should always test your code and never rely on QA to have your back. That is not QA's job.

Whatever you do, don't say any of what you wrote here in your meeting. This is 100% your fault and the only thing you can do is 100% own up to it. Say you broke it, you did not test it and that you are actively looking into ways to perform unit and integration tests so that this won't happen again.

Here is another way to look at things. One of the big things that Civil Engineers used to give Software Engineers grief about is liability and the lack of any licensing. If a Civil Engineer makes a blunder while building a bridge and someone gets killed while driving over it, that Engineer is liable for that mistake. Can you imagine if he went on the news and said "My coworker didn't tell me that cars were going to drive over it during the weekend"?

While the Live software might not literally be life and death, to some companies a Live system breaking can be very close to feeling like it.


14

This is 100% your fault ... and that is 100% wrong. Their development shop is entirely too reckless for not performing peer review of code and unit tests before anything goes to QA, for not communicating when code will go to production, and for having a QA team which failed to catch a major problem before going live. Sure it all started with the developer, but then ANY problems with software can ultimately be traced back to the developer. This is why we have such quality measures as code reviews and QA acceptance testing in the first place.
– maple_shaft
Commented Apr 23, 2012 at 12:07
5

-1 for placing 100% of the fault on the OP. The OP is not a lone operator; he works as part of a team, and as such the team is responsible for delivering, not the OP. It really points to a seriously broken process.
– tehnyit
Commented Jun 15, 2012 at 8:38
2

-1: Well-designed systems have backups to prevent upset from inevitable human errors. This system has none.
– kevin cline
Commented Sep 12, 2012 at 15:55
1

"Why is there a pre-condition on you testing your code?" It was Friday evening, he was interrupted on the way out the door. The OP believed that the code was not being used on a live system and the worst thing that would happen if his code didn't work was inconveniencing the other developer until Monday morning. OK maybe he should have run a quick test or something, but "100% your fault" is extremely harsh.
– komodosp
Commented May 25, 2016 at 15:23
